Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.
PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique): they allow arbitrary data to be incorporated in a PDF and to benefit from its unique features.
import fitz  # = PyMuPDF
doc = fitz.open("test.pdf")  # open the PDF
count = doc.embeddedFileCount
print("number of embedded files:", count)  # shows number of embedded files
# get decompressed content of the data stored under the name "my data"
# (an integer between 0 and count - 1 can be used instead of the name)
buff = doc.embeddedFileGet("my data")
with open("test.file", "wb") as fout:  # open output file; closed automatically
    fout.write(buff)
Deletion, reporting, importing, copying between PDFs, etc. are just as simple.
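As a small illustration of how such a utility might look, the sketch below exports every embedded file by looping over the integer indices 0 to count - 1. It relies only on the two members shown above (embeddedFileCount and embeddedFileGet). The function name export_all_embedded and the "embedded_<i>.bin" output naming are assumptions made for this example; they are not part of PyMuPDF.

```python
import os


def export_all_embedded(doc, out_dir):
    """Write every embedded file of `doc` into `out_dir` and return the paths.

    `doc` is expected to provide `embeddedFileCount` and `embeddedFileGet(i)`.
    The function name and the "embedded_<i>.bin" naming scheme are
    hypothetical and chosen for illustration only.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for i in range(doc.embeddedFileCount):
        buff = doc.embeddedFileGet(i)  # decompressed content of entry i
        path = os.path.join(out_dir, "embedded_%d.bin" % i)
        with open(path, "wb") as fout:
            fout.write(buff)
        written.append(path)
    return written


# Intended use (requires PyMuPDF and an existing PDF):
#   import fitz
#   doc = fitz.open("test.pdf")
#   print(export_all_embedded(doc, "extracted"))
```

Addressing the entries by index avoids having to know their names in advance; the generic output names could be replaced by the stored names if those are known.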
See here for more examples and lightweight utilities:
https://github.com/rk700/PyMuPDF/tree/master/examples
Python 3 is fully supported in any bitness and tested up to and including version 3.6. Supported platforms include at least Windows, Mac and Linux. Other platforms that are supported by both Python and MuPDF should work as well.