Welcome, guest | Sign In | My Account | Store | Cart

Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.

PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique), allowing arbitrary data to be incorporated in a PDF and benefit from its unique features.

Python, 12 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
import fitz                      # = PyMuPDF
doc = fitz.open("test.pdf")      # open the PDF
count = doc.embeddedFileCount
print("number of embedded file:", count)     # shows number of embedded files

# get decompressed content of data stored by name "my data"
# also possible to use integer between 0 and "count - 1"

buff = doc.embeddedFileGet("my data")
fout = open("test.file", "wb")   # open output file
fout.write(buff)
fout.close()

Deletion, reporting, importing, copying between PDFs, etc. is just as simple.

See here for more examples and lightweight utilities:

https://github.com/rk700/PyMuPDF/tree/master/examples

Any Python bitness and Python 3 is fully supported and tested up to and including 3.6. Platforms include at least Windows, Mac and Linux. Ohter platforms should work that are supported by Python and MuPDF.