
Inserting Images on PDF Pages (Python recipe) by Jorj X. McKie
ActiveState Code (http://code.activestate.com/recipes/580803/)

Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page. The following example puts the same image on every page of a given PDF - like a thumbnail.

import fitz                          # <-- PyMuPDF
doc = fitz.open("some.pdf")          # open the PDF
rect = fitz.Rect(0, 0, 100, 100)     # where to put image: use upper left corner

for page in doc:
    page.insertImage(rect, filename = "some.image")

doc.saveIncr()                       # do an incremental save

      

The above script is very fast: stamping every page of Adobe's Reference Manual (1,310 pages) like this should take less than 10 seconds. This is because the underlying library, MuPDF, inserts the new image at most once. It checks whether an existing image already contains the same picture; if so, only a reference to it is inserted.
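To make the "insert once, then reference" idea concrete, here is a minimal, purely illustrative sketch. It is not MuPDF's code: the class ImageStore, its methods, and the use of a SHA-256 digest to recognize identical pictures are all assumptions made for this example.

```python
import hashlib


class ImageStore:
    """Toy stand-in for a PDF's image objects (hypothetical, not MuPDF's structure)."""

    def __init__(self):
        self._by_digest = {}   # content digest -> object number
        self._next_number = 1  # next free (made-up) object number

    def add(self, image_bytes):
        """Return an object number for the image, storing the bytes only once."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._by_digest:
            # First time this picture is seen: store it under a new number.
            self._by_digest[digest] = self._next_number
            self._next_number += 1
        # Known picture: hand back the existing number, i.e. only a reference.
        return self._by_digest[digest]

    def stored_count(self):
        """Number of distinct pictures actually stored."""
        return len(self._by_digest)


store = ImageStore()
logo = b"fake image bytes"                      # placeholder for real image data
refs = [store.add(logo) for _ in range(1310)]   # one reference per page
```

In this sketch every page ends up with the same object number, so the cost of stamping grows with the number of references rather than with the image size times the page count.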

Tags: fitz, mupdf, pdf, pymupdf

3 comments

Hongliang 6 years, 10 months ago

Great stuff Jorj.

I am new to Python. After the first run, which went well (the image was inserted), I tried to rerun the code, but Jupyter shows: "Exception: repaired file - save to new"

I searched FITZ's main page here https://pythonhosted.org/PyMuPDF/tutorial.html but did not manage to figure out a solution. I suspect the error is about document closing, but adding doc.close() does not solve the issue. Any insight here? Thanks!

Jorj X. McKie (author) 6 years, 10 months ago

Hi Hongliang, thanks for your message. Please don't hesitate to open an issue on GitHub ...

However, your case does look weird.

The error message means that the PDF was corrupt in some way, which has been detected and fixed during open. But because of this, the next save cannot be incremental - it needs to be to a new file.

This is not your fault (probably) and it is not a problem on the Python level: the complaint comes from the underlying C library (MuPDF).

Please provide me with more details and I am happy to help:

your example PDF, the script(s) (or manual entries) you were using and any information that might help to re-create the error: Operating system, bitness, Python level, fitz.__doc__ info (contains generation time stamp).

As I said, best would be to use https://github.com/rk700/PyMuPDF/issues for this

Jorj X. McKie (author) 6 years, 10 months ago

To be more explicit: when I say "save to a new file" I mean you should use doc.save("some-new.pdf") instead of doc.saveIncr().

This creates a new PDF with all your changes. The method save() offers several more parameters which allow compression among other things.

What I still do not understand is: how could your PDF become damaged between the first and the second save?

... awaiting your details :-)
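The fallback described in the comments above (use doc.save("some-new.pdf") when doc.saveIncr() is refused) could be wrapped in a small helper along these lines. This is a sketch under assumptions: the function name save_changes and its return values are invented for illustration, and catching the generic Exception is a guess based only on the message quoted above, since the exact exception type is not stated.

```python
def save_changes(doc, fallback_path):
    """Try an incremental save; if it is refused, write a complete new file.

    doc is expected to offer saveIncr() and save(path), as used in the
    recipe and comments. Returns "incremental" or "full" to report which
    route was taken (these return values are made up for this sketch).
    """
    try:
        doc.saveIncr()               # fast path: append changes to the open file
        return "incremental"
    except Exception:
        # Assumption: a refusal such as "repaired file - save to new"
        # surfaces as an ordinary Python exception.
        doc.save(fallback_path)      # full save to a new file
        return "full"
```

In the recipe, the final line would then become something like save_changes(doc, "some-new.pdf"). Note that the broad except also hides unrelated errors, so a real script may want to inspect the message before falling back.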

Created by Jorj X. McKie on Wed, 17 May 2017 (MIT)


Required Modules

pymupdf

Other Information and Tasks

Licensed under the MIT License
Revision 1
