public class PDDocument extends Object implements Closeable
Constructor | Description |
---|---|
PDDocument() | Creates an empty PDF document. |
PDDocument(org.apache.pdfbox.cos.COSDocument doc) | Constructor that uses an existing document. |
PDDocument(org.apache.pdfbox.cos.COSDocument doc, org.apache.pdfbox.io.RandomAccessRead source) | Constructor that uses an existing document. |
PDDocument(org.apache.pdfbox.cos.COSDocument doc, org.apache.pdfbox.io.RandomAccessRead source, org.apache.pdfbox.pdmodel.encryption.AccessPermission permission) | Constructor that uses an existing document. |
PDDocument(org.apache.pdfbox.io.RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction) | Creates an empty PDF document. |
Modifier and Type | Method | Description |
---|---|---|
void | addFontToSubset(PDFont font) | Adds the font to be subset. |
void | addPage(org.apache.pdfbox.pdmodel.PDPage page) | This will add a page to the document. |
void | addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject) | Add parameters of signature to be created externally using default signature options. |
void | addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureInterface signatureInterface) | Add a signature to be created using the instance of given interface. |
void | addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureInterface signatureInterface, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions options) | This will add a signature to the document. |
void | addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions options) | Add parameters of signature to be created externally. |
void | close() | This will close the underlying COSDocument object. |
org.apache.pdfbox.pdmodel.encryption.AccessPermission | getCurrentAccessPermission() | Returns the access permissions granted when the document was decrypted. |
org.apache.pdfbox.cos.COSDocument | getDocument() | This will get the low level document. |
org.apache.pdfbox.pdmodel.PDDocumentCatalog | getDocumentCatalog() | This will get the document catalog. |
Long | getDocumentId() | Provides the document ID. |
org.apache.pdfbox.pdmodel.PDDocumentInformation | getDocumentInformation() | This will get the document info dictionary. |
org.apache.pdfbox.pdmodel.encryption.PDEncryption | getEncryption() | This will get the encryption dictionary for this document. |
org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature | getLastSignatureDictionary() | This will return the last signature from the field tree. |
int | getNumberOfPages() | This will return the total page count of the PDF document. |
org.apache.pdfbox.pdmodel.PDPage | getPage(int pageIndex) | Returns the page at the given 0-based index. |
org.apache.pdfbox.pdmodel.PDPageTree | getPages() | Returns the page tree. |
org.apache.pdfbox.pdmodel.ResourceCache | getResourceCache() | Returns the resource cache associated with this document, or null if there is none. |
List<org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature> | getSignatureDictionaries() | Retrieve all signature dictionaries from the document. |
List<org.apache.pdfbox.pdmodel.interactive.form.PDSignatureField> | getSignatureFields() | Retrieve all signature fields from the document. |
float | getVersion() | Returns the PDF specification version this document conforms to. |
org.apache.pdfbox.pdmodel.PDPage | importPage(org.apache.pdfbox.pdmodel.PDPage page) | This will import and copy the contents from another location. |
boolean | isAllSecurityToBeRemoved() | Indicates if all security is removed or not when writing the PDF. |
boolean | isEncrypted() | This will tell if this document is encrypted or not. |
void | protect(org.apache.pdfbox.pdmodel.encryption.ProtectionPolicy policy) | Protects the document with a protection policy. |
void | registerTrueTypeFontForClosing(org.apache.fontbox.ttf.TrueTypeFont ttf) | For internal PDFBox use when creating PDF documents: register a TrueTypeFont to make sure it is closed when the PDDocument is closed to avoid memory leaks. |
void | removePage(int pageNumber) | Remove the page from the document. |
void | removePage(org.apache.pdfbox.pdmodel.PDPage page) | Remove the page from the document. |
void | save(File file) | Save the document to a file using default compression. |
void | save(File file, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) | Save the document using the given compression. |
void | save(OutputStream output) | This will save the document to an output stream. |
void | save(OutputStream output, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) | Save the document using the given compression. |
void | save(String fileName) | Save the document to a file using default compression. |
void | save(String fileName, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) | Save the document to a file using the given compression. |
void | saveIncremental(OutputStream output) | Save the PDF as an incremental update. |
void | saveIncremental(OutputStream output, Set<org.apache.pdfbox.cos.COSDictionary> objectsToWrite) | Save the PDF as an incremental update. |
org.apache.pdfbox.pdmodel.interactive.digitalsignature.ExternalSigningSupport | saveIncrementalForExternalSigning(OutputStream output) | Save PDF incrementally without closing for external signature creation scenario. |
void | setAllSecurityToBeRemoved(boolean removeAllSecurity) | Activates/Deactivates the removal of all security when writing the PDF. |
void | setDocumentId(Long docId) | Sets the document ID to the given value. |
void | setDocumentInformation(org.apache.pdfbox.pdmodel.PDDocumentInformation info) | This will set the document information for this document. |
void | setEncryptionDictionary(org.apache.pdfbox.pdmodel.encryption.PDEncryption encryption) | This will set the encryption dictionary for this document. |
void | setResourceCache(org.apache.pdfbox.pdmodel.ResourceCache resourceCache) | Sets the resource cache associated with this document. |
void | setVersion(float newVersion) | Sets the PDF specification version for this document. |
public PDDocument()
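As a minimal sketch of the no-argument constructor in use, the following builds a one-page document in memory and saves it to a byte array. It assumes PDFBox 3.x is on the classpath; the class name `EmptyDocumentSketch` and the method `createSinglePagePdf` are illustrative names, not part of the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

final class EmptyDocumentSketch {

    /** Builds a one-page document in memory and returns the saved bytes. */
    static byte[] createSinglePagePdf() throws IOException {
        // PDDocument implements Closeable, so try-with-resources closes it.
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage()); // blank page with the default size
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.save(out);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(createSinglePagePdf().length + " bytes written");
    }
}
```

Saving before the try block ends keeps every use of the document ahead of close().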
public PDDocument(org.apache.pdfbox.io.RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
streamCacheCreateFunction
- a function to create an instance of a stream cache for buffering PDF streams
public PDDocument(org.apache.pdfbox.cos.COSDocument doc)
doc
- The COSDocument that this document wraps.
public PDDocument(org.apache.pdfbox.cos.COSDocument doc, org.apache.pdfbox.io.RandomAccessRead source)
doc
- The COSDocument that this document wraps.
source
- input representing the PDF
public PDDocument(org.apache.pdfbox.cos.COSDocument doc, org.apache.pdfbox.io.RandomAccessRead source, org.apache.pdfbox.pdmodel.encryption.AccessPermission permission)
doc
- The COSDocument that this document wraps.
source
- input representing the PDF
permission
- the access permissions of the PDF
public void addPage(org.apache.pdfbox.pdmodel.PDPage page)
page
- The page to add to the document.
public void addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject) throws IOException
Add parameters of signature to be created externally using default signature options. See the saveIncrementalForExternalSigning(OutputStream) method description for details of the external signature creation scenario.
Only one signature may be added to a document. To sign several times, load the document, add a signature, save incrementally, and close it again.
sigObject
- is the PDSignatureField model
IOException
- if there is an error creating required fields
IllegalStateException
- if one attempts to add several signature fields.
public void addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions options) throws IOException
Add parameters of signature to be created externally. See the saveIncrementalForExternalSigning(OutputStream) method description for details of the external signature creation scenario.
Only one signature may be added to a document. To sign several times, load the document, add a signature, save incrementally, and close it again.
sigObject
- is the PDSignatureField model
options
- signature options
IOException
- if there is an error creating required fields
IllegalStateException
- if one attempts to add several signature fields.
public void addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureInterface signatureInterface) throws IOException
Add a signature to be created using the instance of given interface.
Only one signature may be added to a document. To sign several times, load the document, add a signature, save incrementally, and close it again.
sigObject
- is the PDSignatureField model
signatureInterface
- is an interface whose implementation provides signing capabilities. Can be null if external signing is used.
IOException
- if there is an error creating required fields
IllegalStateException
- if one attempts to add several signature fields.
public void addSignature(org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature sigObject, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureInterface signatureInterface, org.apache.pdfbox.pdmodel.interactive.digitalsignature.SignatureOptions options) throws IOException
This will add a signature to the document.
Only one signature may be added to a document. To sign several times, load the document, add a signature, save incrementally, and close it again.
sigObject
- is the PDSignatureField model
signatureInterface
- is an interface whose implementation provides signing capabilities. Can be null if external signing is used.
options
- signature options
IOException
- if there is an error creating required fields
IllegalStateException
- if one attempts to add several signature fields.
public void removePage(org.apache.pdfbox.pdmodel.PDPage page)
page
- The page to remove from the document.
public void removePage(int pageNumber)
pageNumber
- 0-based index of the page to remove.
public org.apache.pdfbox.pdmodel.PDPage importPage(org.apache.pdfbox.pdmodel.PDPage page) throws IOException
This will import and copy the contents from another location. If a copy is not needed, use the addPage() method.
Unlike addPage(), this method creates a new PDPage object. If your page has annotations, and if these link to pages not in the target document, then the target document might become huge. What you need to do is to delete page references of such annotations. See here for how to do this.
Inherited (global) resources are ignored because these can contain resources not needed for this page which could bloat your document, see PDFBOX-28 and related issues. If you need them, call importedPage.setResources(page.getResources());
This method should only be used to import a page from a loaded document, not from a generated document, because the latter can contain unfinished parts, e.g. font subsetting information.
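The import flow described above can be sketched as follows, assuming PDFBox 3.x (where `Loader.loadPDF(byte[])` loads a document from memory). `ImportPageSketch` and `importFirstPage` are hypothetical names used only for illustration.

```java
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

final class ImportPageSketch {

    /**
     * Copies the first page of a loaded PDF into a new document and
     * returns the page count of the new document.
     */
    static int importFirstPage(byte[] sourcePdf) throws IOException {
        try (PDDocument source = Loader.loadPDF(sourcePdf);
             PDDocument target = new PDDocument()) {
            PDPage page = source.getPage(0);
            PDPage imported = target.importPage(page);
            // Inherited resources are ignored by importPage; copy them
            // back only if the page actually needs them.
            imported.setResources(page.getResources());
            // A real caller would save `target` here. Keeping `source`
            // open until then is the cautious choice in this sketch.
            return target.getNumberOfPages();
        }
    }
}
```

The source is loaded rather than generated, in line with the restriction stated above.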
page
- The page to import.
IOException
- If there is an error copying the page.
public org.apache.pdfbox.cos.COSDocument getDocument()
public org.apache.pdfbox.pdmodel.PDDocumentInformation getDocumentInformation()
This will get the document info dictionary.
In PDF 2.0 this is deprecated except for two entries, /CreationDate and /ModDate. For any other document level metadata, a metadata stream should be used instead, see PDDocumentCatalog.getMetadata().
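A short sketch of limiting the info dictionary to the two entries that PDF 2.0 still allows. It assumes PDFBox 3.x; `DocumentInfoSketch` and `stampDates` are illustrative names, not library API.

```java
import java.util.Calendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;

final class DocumentInfoSketch {

    /** Sets /CreationDate and /ModDate to the current time and returns the info dictionary. */
    static PDDocumentInformation stampDates(PDDocument document) {
        PDDocumentInformation info = document.getDocumentInformation();
        Calendar now = Calendar.getInstance();
        info.setCreationDate(now);     // /CreationDate
        info.setModificationDate(now); // /ModDate
        document.setDocumentInformation(info);
        return info;
    }
}
```

Other document-level metadata would go into the metadata stream of the catalog instead.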
public void setDocumentInformation(org.apache.pdfbox.pdmodel.PDDocumentInformation info)
This will set the document information for this document.
In PDF 2.0 this is deprecated except for two entries, /CreationDate and /ModDate. For any other document level metadata, a metadata stream should be used instead, see PDDocumentCatalog.setMetadata(PDMetadata).
info
- The updated document information.
public org.apache.pdfbox.pdmodel.PDDocumentCatalog getDocumentCatalog()
public boolean isEncrypted()
public org.apache.pdfbox.pdmodel.encryption.PDEncryption getEncryption()
public void setEncryptionDictionary(org.apache.pdfbox.pdmodel.encryption.PDEncryption encryption)
encryption
- The encryption dictionary (most likely a PDStandardEncryption object)
public org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature getLastSignatureDictionary()
This will return the last signature from the field tree; see PDSignatureField.
public List<org.apache.pdfbox.pdmodel.interactive.form.PDSignatureField> getSignatureFields()
Returns a List of PDSignatureFields.
public List<org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature> getSignatureDictionaries()
Returns a List of PDSignatures.
public void registerTrueTypeFontForClosing(org.apache.fontbox.ttf.TrueTypeFont ttf)
ttf
- the TrueTypeFont to be registered
public void addFontToSubset(PDFont font)
font
- the font to be subset
public void save(String fileName) throws IOException
Save the document to a file using default compression.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
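One way to avoid writing onto the input file is to save to a temporary sibling file first and replace the target afterwards. The sketch below uses only the JDK. `SafeSaveSketch`, `DocumentWriter` and `saveViaTempFile` are hypothetical helpers, not PDFBox API; in real use the callback would call `document.save(out)`.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeSaveSketch {

    /** Hypothetical callback; a real caller would pass out -> document.save(out). */
    @FunctionalInterface
    interface DocumentWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Writes to a temporary file next to the target, then replaces the target.
     * The target is never opened for writing while the writer is running.
     */
    static void saveViaTempFile(Path target, DocumentWriter writer) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "pdf-save-", ".tmp");
        try {
            // Buffer the stream, as recommended for save(OutputStream).
            try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(tmp))) {
                writer.writeTo(out);
            }
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

On platforms that lock open files, the document loaded from the target would likely need to be closed before the move; the sketch does not model that step.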
fileName
- The file to save as.
IOException
- if the output could not be written
public void save(File file) throws IOException
Save the document to a file using default compression.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
file
- The file to save as.
IOException
- if the output could not be written
public void save(OutputStream output) throws IOException
This will save the document to an output stream.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
output
- The stream to write to. It is recommended to wrap it in a BufferedOutputStream, unless it is already buffered.
IOException
- if the output could not be written
public void save(File file, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) throws IOException
Save the document using the given compression.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
file
- The file to save as.
compressParameters
- The parameters for the document's compression.
IOException
- if the output could not be written
public void save(String fileName, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) throws IOException
Save the document to a file using the given compression.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
fileName
- The file to save as.
compressParameters
- The parameters for the document's compression.
IOException
- if the output could not be written
public void save(OutputStream output, org.apache.pdfbox.pdfwriter.compress.CompressParameters compressParameters) throws IOException
Save the document using the given compression.
Don't use the input file as target as this will produce a corrupted file.
If encryption has been activated (with protect(ProtectionPolicy)), do not use the document after saving because the contents are now encrypted.
output
- The stream to write to. It is recommended to wrap it in a BufferedOutputStream, unless it is already buffered.
compressParameters
- The parameters for the document's compression.
IOException
- if the output could not be written
public void saveIncremental(OutputStream output) throws IOException
Save the PDF as an incremental update. There must be a path of objects that have COSUpdateInfo.isNeedToBeUpdated() set, starting from the document catalog. For signatures this is taken care of by PDFBox itself.
Other usages of this method are for experienced users only. You will usually never need it. It is useful only if you are required to keep the current revision and append the changes. A typical use case is changing a signed file without invalidating the signature.
If your modification includes annotations, make sure these link back to their page by calling PDAnnotation.setPage(PDPage). Although this is optional, not doing it can cause trouble when PDFs get signed. (PDFBox already does this for signature widget annotations.)
Another problem with page-based modifications can occur if the page tree isn't flat: there won't be a closed update path from the catalog to the page. To fix this, add code like this:
COSDictionary parent = page.getCOSObject().getCOSDictionary(COSName.PARENT);
while (parent != null)
{
parent.setNeedToBeUpdated(true);
parent = parent.getCOSDictionary(COSName.PARENT);
}
Don't use the input file as target as this will produce a corrupted file.
output
- stream to write to. It will be closed when done. It must never point to the source file or that one will be harmed!
IOException
- if the output could not be written
IllegalStateException
- if the document was not loaded from a file or a stream.
public void saveIncremental(OutputStream output, Set<org.apache.pdfbox.cos.COSDictionary> objectsToWrite) throws IOException
Save the PDF as an incremental update. This variant allows objects to be included even if there is no path of objects that have COSUpdateInfo.isNeedToBeUpdated() set, so the incremental update gets smaller. Only dictionaries are supported; if you need to update other object classes, then add their parent dictionary.
This method is for experienced users only. You will usually never need it. It is useful only if you are required to keep the current revision and append the changes. A typical use case is changing a signed file without invalidating the signature. To know which objects are getting changed, you need to have some understanding of the PDF specification, and look at the saved file with an editor to verify that you are updating the correct objects. You should also inspect the page and document structures of the file with PDFDebugger.
If your modification includes annotations, make sure these link back to their page by calling PDAnnotation.setPage(PDPage). Although this is optional, not doing it can cause trouble when PDFs get signed. (PDFBox already does this for signature widget annotations.)
Don't use the input file as target as this will produce a corrupted file.
output
- stream to write to. It will be closed when done. It must never point to the source file or that one will be harmed!
objectsToWrite
- objects that must be part of the incremental saving.
IOException
- if the output could not be written
IllegalStateException
- if the document was not loaded from a file or a stream.
public org.apache.pdfbox.pdmodel.interactive.digitalsignature.ExternalSigningSupport saveIncrementalForExternalSigning(OutputStream output) throws IOException
PDDocument pdDocument = ...;
OutputStream outputStream = ...;
SignatureOptions signatureOptions = ...; // options to specify fine tuned signature options or null for defaults
PDSignature pdSignature = ...;

// add signature parameters to be used when creating signature dictionary
pdDocument.addSignature(pdSignature, signatureOptions);
// prepare PDF for signing and obtain helper class to be used
ExternalSigningSupport externalSigningSupport = pdDocument.saveIncrementalForExternalSigning(outputStream);
// get data to be signed
InputStream dataToBeSigned = externalSigningSupport.getContent();
// invoke signature service
byte[] signature = sign(dataToBeSigned);
// set resulted CMS signature
externalSigningSupport.setSignature(signature);

// last step is to close the document
pdDocument.close();
Note that after calling this method, only the close() method may be invoked on the PDDocument instance, and only AFTER the ExternalSigningSupport instance has been used.
Don't use the input file as target as this will produce a corrupted file.
output
- stream to write the final PDF. It will be closed when the document is closed. It must never point to the source file or that one will be harmed!
IOException
- if the output could not be written
IllegalStateException
- if the document was not loaded from a file or a stream or signature options were not set.
public org.apache.pdfbox.pdmodel.PDPage getPage(int pageIndex)
Returns the page at the given 0-based index.
This method is too slow to get all the pages from a large PDF document (1000 pages or more). For such documents, use the iterator of getPages() instead.
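A minimal sketch of walking all pages through the page tree iterator rather than calling getPage(int) in a loop. It assumes PDFBox 3.x; `PageIterationSketch` and `pageWidths` are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

final class PageIterationSketch {

    /** Collects the media box width of every page in a single pass over the page tree. */
    static List<Float> pageWidths(PDDocument document) {
        List<Float> widths = new ArrayList<>();
        // PDPageTree is Iterable<PDPage>, so the for-each loop uses its iterator.
        for (PDPage page : document.getPages()) {
            widths.add(page.getMediaBox().getWidth());
        }
        return widths;
    }
}
```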
pageIndex
- the 0-based page index
public org.apache.pdfbox.pdmodel.PDPageTree getPages()
public int getNumberOfPages()
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
IOException
- If there is an error releasing resources.
public void protect(org.apache.pdfbox.pdmodel.encryption.ProtectionPolicy policy) throws IOException
Protects the document with a protection policy. This method also calls setAllSecurityToBeRemoved(boolean) with a false argument if it was set to true previously, and logs a warning.
Do not use the document after saving, because the structures are encrypted.
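As a hedged sketch of protecting a document with a password-based policy, the following assumes PDFBox 3.x and its `StandardProtectionPolicy`. The class `ProtectSketch`, the method `saveEncrypted` and the choice of a 128-bit key are illustrative assumptions, not recommendations from this page.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.encryption.AccessPermission;
import org.apache.pdfbox.pdmodel.encryption.StandardProtectionPolicy;

final class ProtectSketch {

    /** Creates a one-page document, marks it for encryption, and returns the saved bytes. */
    static byte[] saveEncrypted(String ownerPassword, String userPassword) throws IOException {
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage());

            AccessPermission permissions = new AccessPermission();
            permissions.setCanPrint(false); // example restriction

            StandardProtectionPolicy policy =
                    new StandardProtectionPolicy(ownerPassword, userPassword, permissions);
            policy.setEncryptionKeyLength(128); // assumed key length for this sketch

            document.protect(policy);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.save(out);
            // The document is not used again after save(); it is only closed.
            return out.toByteArray();
        }
    }
}
```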
policy
- The protection policy.
IOException
- if there isn't any suitable security handler.
See also: StandardProtectionPolicy, PublicKeyProtectionPolicy
public org.apache.pdfbox.pdmodel.encryption.AccessPermission getCurrentAccessPermission()
public boolean isAllSecurityToBeRemoved()
public void setAllSecurityToBeRemoved(boolean removeAllSecurity)
removeAllSecurity
- remove all security if set to true
public Long getDocumentId()
Provides the document ID. See COSDocument.getDocumentID() for the trailer document ID. Read PDFBOX-1613 for more details about the purpose.
public void setDocumentId(Long docId)
Sets the document ID to the given value. See COSDocument.setDocumentID(COSArray) for the trailer document ID. Read PDFBOX-1613 for more details about the purpose.
docId
- the new document ID
public float getVersion()
public void setVersion(float newVersion)
newVersion
- the new PDF version (e.g. 1.4f)
public org.apache.pdfbox.pdmodel.ResourceCache getResourceCache()
public void setResourceCache(org.apache.pdfbox.pdmodel.ResourceCache resourceCache)
resourceCache
- A resource cache, or null.
Copyright © 2024. All rights reserved.