Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page. The following example puts the same image on every page of a given PDF - like a thumbnail.
import fitz  # <-- PyMuPDF
doc = fitz.open("some.pdf")  # open the PDF
rect = fitz.Rect(0, 0, 100, 100)  # where to put image: use upper left corner
for page in doc:
    page.insertImage(rect, filename="some.image")
doc.saveIncr()  # do an incremental save
The above script is very fast: stamping every page of Adobe's Reference Manual (1,310 pages) like this should take less than 10 seconds. This is because our underlying library, MuPDF, inserts the new image at most once. Existing images are checked for whether they contain the same picture; if so, only a reference to the existing image is inserted.
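The "insert at most once" behaviour can be pictured with a small sketch. This is not MuPDF's actual code, only an illustration under the assumption that pictures are compared by a hash of their bytes. The class ImageStore and its method add are hypothetical names invented for this example.

```python
import hashlib


class ImageStore:
    """Toy model of a PDF's image objects (hypothetical, for illustration only)."""

    def __init__(self):
        self._by_digest = {}  # digest of image bytes -> object number
        self._next_xref = 1   # next free object number

    def add(self, image_bytes):
        """Return an object number for the image, storing the bytes only once."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._by_digest:
            # First time this picture is seen: store it as a new object.
            self._by_digest[digest] = self._next_xref
            self._next_xref += 1
        # Known picture: only the reference is handed back.
        return self._by_digest[digest]

    def __len__(self):
        return len(self._by_digest)


store = ImageStore()
logo = b"fake image bytes"
refs = [store.add(logo) for _ in range(1310)]  # one call per page
print(len(store), set(refs))
```

However many pages ask for the same picture, the store ends up holding a single copy and every page gets the same reference, which is why the file hardly grows and the stamping stays fast.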
Great stuff Jorj.
I am new to Python. The first run went well (the image was inserted), but when I tried to run the code again, Jupyter showed: "Exception: repaired file - save to new"
I searched FITZ's main page here https://pythonhosted.org/PyMuPDF/tutorial.html but did not manage to figure out a solution. I suspect the error is about document closing, but adding doc.close() does not solve the issue. Any insight here? Thanks!
Hi Hongliang, thanks for your message. Please don't hesitate to open an issue on GitHub ...
However, your case does look weird.
The error message means that the PDF was corrupt in some way, which has been detected and fixed during open. But because of this, the next save cannot be incremental - it needs to be to a new file.
This is not your fault (probably) and it is not a problem on the Python level: the complaint comes from the underlying C library (MuPDF).
Please provide me with more details and I am happy to help:
your example PDF, the script(s) (or manual entries) you were using and any information that might help to re-create the error: Operating system, bitness, Python level, fitz.__doc__ info (contains generation time stamp).
As I said, best would be to use https://github.com/rk700/PyMuPDF/issues for this
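For the environment details requested above, something like the following could be pasted into an issue. It is only a sketch: the function name collect_report_info is made up for this example, and it assumes that fitz.__doc__ holds the version and generation info as described above.

```python
import platform
import sys


def collect_report_info():
    """Gather the environment details for a bug report as a dict of strings."""
    info = {
        "os": platform.platform(),               # operating system and release
        "bitness": platform.architecture()[0],   # e.g. "64bit"
        "python": sys.version.split()[0],        # Python level, e.g. "3.6.1"
    }
    try:
        import fitz  # PyMuPDF
        info["fitz"] = str(fitz.__doc__)
    except ImportError:
        # Keep the report usable even where PyMuPDF is not installed.
        info["fitz"] = "PyMuPDF not importable"
    return info


for key, value in collect_report_info().items():
    print(f"{key}: {value}")
```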
To be more explicit: when I say "save to a new file" I mean you should use doc.save("some-new.pdf") instead of doc.saveIncr(). This creates a new PDF with all your changes. The method save() offers several more parameters which allow compression among other things.
What I still do not understand is: how could your PDF become damaged between the first and the second save?
... awaiting your details :-)
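Until the cause is found, a script could guard against this situation along the following lines. This is a minimal sketch, not an official recipe: the helper name save_changes is hypothetical, and it assumes the refusal arrives as an exception whose text contains "save to new", as in the message quoted above. Only saveIncr() and save() from the discussion are used.

```python
def save_changes(doc, fallback_path):
    """Try an incremental save; if it is refused, save to a new file instead.

    `doc` is expected to be an open fitz.Document. Returns the string
    "incremental", or the path of the newly written file.
    """
    try:
        doc.saveIncr()
        return "incremental"
    except Exception as exc:
        # Assumption: a repaired PDF is reported with the message quoted
        # above. Any other error is passed on unchanged.
        if "save to new" not in str(exc):
            raise
        doc.save(fallback_path)
        return fallback_path
```

In the script from the post, the last line doc.saveIncr() would then become save_changes(doc, "some-new.pdf"). The original file is left untouched when the fallback is taken.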