Most viewed recipes tagged "pdf" but not "xtopdf"
http://code.activestate.com/recipes/tags/pdf-xtopdf/views/
2017-07-11T18:57:54-07:00
ActiveState Code Recipes
Insert a Text Box in a PDF page (fitz / PyMuPDF) (Python)
2017-06-29T22:54:25-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580809-insert-a-text-box-in-a-pdf-page-fitz-pymupdf/
<p style="color: grey">
Python
recipe 580809
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/textbox/">textbox</a>).
</p>
<p>This method inserts text into a predefined rectangular area of a (new or existing) PDF page.
Words are distributed across the available space and wrapped onto new lines where required. Line breaks and tab characters are respected and resolved.
Text can be aligned in the box (left, center, right), and fonts can be freely chosen.
The method returns a float indicating how much vertical space is left over after filling the area.</p>
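<p>The wrapping and the leftover-space return value can be illustrated with a minimal, library-free sketch. It is not the recipe's code: the function name <code>fill_text_box</code>, the fixed glyph width (half the font size) and the line height (1.2 times the font size) are assumptions made only for illustration, and a negative result is this sketch's own way of signalling that the text did not fit.</p>

```python
import textwrap


def fill_text_box(text, box_width, box_height, fontsize=11.0,
                  char_width_factor=0.5, line_height_factor=1.2,
                  align="left"):
    """Wrap text into a box and return (lines, leftover_height).

    Assumptions (hypothetical, not taken from the recipe): every glyph is
    char_width_factor * fontsize wide and every line is
    line_height_factor * fontsize high.
    """
    max_chars = max(1, int(box_width / (fontsize * char_width_factor)))
    line_height = fontsize * line_height_factor

    lines = []
    # Explicit line breaks are kept; tabs are resolved to spaces first.
    for paragraph in text.expandtabs(4).split("\n"):
        lines.extend(textwrap.wrap(paragraph, width=max_chars) or [""])

    if align == "center":
        lines = [line.center(max_chars).rstrip() for line in lines]
    elif align == "right":
        lines = [line.rjust(max_chars) for line in lines]

    # Positive: space left over; negative: the text overflowed the box.
    leftover = box_height - len(lines) * line_height
    return lines, leftover


lines, leftover = fill_text_box("one two three four\nfive",
                                box_width=55, box_height=100, fontsize=10)
```

<p>A real implementation would measure each glyph with the chosen font's metrics instead of a fixed width.</p>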
Create Calendars on PDF with a few lines (Python)
2017-06-13T10:57:34-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580805-create-calendars-on-pdf-with-a-few-lines/
<p style="color: grey">
Python
recipe 580805
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/calendar/">calendar</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 2.
</p>
<p>PyMuPDF (fitz) provides easy-to-use ways to create PDF documents out of simple texts.</p>
<p>An example is the text output of Python's calendar module. Here we take a starting year as a script parameter and output a 3-page (A4 landscape) document with calendars for this and the following two years, in fewer than 20 lines of code.</p>
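<p>The text side of this can be sketched with the standard library alone. The helper name <code>year_calendars</code> and the choice of four months per row are assumptions for illustration; placing the resulting text on PDF pages is left to the recipe itself.</p>

```python
import calendar


def year_calendars(start_year, count=3):
    """Return one plain-text calendar per year, starting at start_year."""
    cal = calendar.TextCalendar(firstweekday=calendar.MONDAY)
    # m=4 prints four months per row, which suits a landscape page.
    return [cal.formatyear(start_year + i, m=4) for i in range(count)]


pages = year_calendars(2017)  # one text block per future PDF page
```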
Inserting Images on PDF Pages (Python)
2017-05-17T21:10:26-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580803-inserting-images-on-pdf-pages/
<p style="color: grey">
Python
recipe 580803
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page.
The following example puts the same image on every page of a given PDF - like a thumbnail.</p>
How to handle PDF embedded files with PyMuPDF (Python)
2017-07-11T18:57:54-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580796-how-to-handle-pdf-embedded-files-with-pymupdf/
<p style="color: grey">
Python
recipe 580796
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/embedded_files/">embedded_files</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 3.
</p>
<p>Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.</p>
<p>PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique), allowing arbitrary data to be incorporated in a PDF and benefit from its unique features.</p>
How to Create a PDF with a Caustic Drawing (Python)
2017-06-18T17:43:47-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580806-how-to-create-a-pdf-with-a-caustic-drawing/
<p style="color: grey">
Python
recipe 580806
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Just a little demo on how to create simple drawings with PyMuPDF.</p>
<p>This script simulates what you see looking into your coffee mug, early in the morning after a long night of programming ...</p>
Inserting pages into a PDF with PyMuPDF (Python)
2017-05-17T21:15:26-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580802-inserting-pages-into-a-pdf-with-pymupdf/
<p style="color: grey">
Python
recipe 580802
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/text_conversion/">text_conversion</a>).
Revision 2.
</p>
<p>Version 1.11.0 of PyMuPDF allows creating new PDF pages, as well as inserting images into existing pages.</p>
<p>Here is a script that converts any text file into a PDF.</p>
Convert doc and docx files to pdf (Python)
2014-03-31T18:39:16-07:00
Fabian Mayer
http://code.activestate.com/recipes/users/4189629/
http://code.activestate.com/recipes/578858-convert-doc-and-docx-files-to-pdf/
<p style="color: grey">
Python
recipe 578858
by <a href="/recipes/users/4189629/">Fabian Mayer</a>
(<a href="/recipes/tags/doc/">doc</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/win32com/">win32com</a>).
Revision 2.
</p>
<p>The script converts all doc and docx files in a specified folder to PDF files. It checks whether the provided absolute path actually exists and whether the specified folder contains any doc or docx files. It does not traverse the directory recursively. The script is not portable and runs only on a Windows machine. Based on my experience, I recommend closing MS Word before running the script.</p>
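<p>The checks described above (path exists, folder contains doc or docx files, no recursion) might look roughly like the following portable sketch. The function name <code>find_word_files</code> is hypothetical, and the Windows-only conversion step itself is deliberately left out.</p>

```python
import os


def find_word_files(folder):
    """Return the .doc/.docx files directly inside folder (no recursion)."""
    folder = os.path.abspath(folder)
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"No such folder: {folder}")

    # os.listdir only looks at the top level, so subfolders are ignored.
    names = sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
        and name.lower().endswith((".doc", ".docx"))
    )
    if not names:
        raise ValueError(f"No doc or docx files in: {folder}")
    return [os.path.join(folder, name) for name in names]
```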
PDF Text Extraction using fitz / MuPDF (PyMuPDF) (Python)
2016-03-17T12:00:06-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580626-pdf-text-extraction-using-fitz-mupdf-pymupdf/
<p style="color: grey">
Python
recipe 580626
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/text_extraction/">text_extraction</a>, <a href="/recipes/tags/xps/">xps</a>).
</p>
<p>Extract all the text of a PDF (or other supported container types) at very high speed.
In general, text pieces of a PDF page are not arranged in natural reading order, but in the order they were entered during PDF creation.
This script re-arranges text blocks according to their pixel coordinates to achieve a more readable output, i.e. top-down, left-right.</p>
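<p>The re-ordering idea can be shown without any PDF library. In this sketch the blocks are hypothetical <code>(x0, y0, text)</code> tuples with y growing downward; the actual script obtains its blocks from the document, which is not shown here.</p>

```python
def reading_order(blocks):
    """Sort (x0, y0, text) blocks top-down, then left-right."""
    return sorted(blocks, key=lambda block: (block[1], block[0]))


# Blocks in the order they might have been written during PDF creation.
blocks = [
    (300, 50, "right column"),
    (50, 50, "left column"),
    (50, 20, "title"),
]
ordered = [text for _x, _y, text in reading_order(blocks)]
# ordered == ["title", "left column", "right column"]
```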
Crop PDF File with pyPdf (Python)
2011-11-03T17:42:10-07:00
ccpizza
http://code.activestate.com/recipes/users/4170754/
http://code.activestate.com/recipes/576837-crop-pdf-file-with-pypdf/
<p style="color: grey">
Python
recipe 576837
by <a href="/recipes/users/4170754/">ccpizza</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pypdf/">pypdf</a>).
Revision 3.
</p>
<p>This recipe was originally posted by <code>sjvr767</code> on <a href="http://www.mobileread.com/forums/showthread.php?t=25565" rel="nofollow">http://www.mobileread.com/forums/showthread.php?t=25565</a> and I decided to also make it available here.</p>
<p>It uses pyPdf (<a href="http://pybrary.net/pyPdf/" rel="nofollow">http://pybrary.net/pyPdf/</a>).</p>
<p>The script is supposed to be run like this:</p>
<p><code>pdf_crop.py -m "120 50 120 180" -i mypdf.pdf</code></p>
<p>where the margins are <code>left top right bottom</code>.</p>
<p>To install pyPdf try <code>easy_install pypdf</code>.</p>
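<p>How a <code>left top right bottom</code> margin string could translate into a crop rectangle is sketched below. This is an assumption about the arithmetic, not the recipe's code: the helper names are hypothetical, margins are taken to be amounts trimmed from each edge, and coordinates follow the PDF convention of an origin at the bottom-left corner.</p>

```python
def parse_margins(spec):
    """Parse 'left top right bottom' into a tuple of four floats."""
    parts = spec.split()
    if len(parts) != 4:
        raise ValueError("expected four numbers: left top right bottom")
    left, top, right, bottom = (float(part) for part in parts)
    return left, top, right, bottom


def crop_box(page_width, page_height, margins):
    """Return (lower_left, upper_right) after trimming the margins."""
    left, top, right, bottom = margins
    lower_left = (left, bottom)
    upper_right = (page_width - right, page_height - top)
    return lower_left, upper_right


# An A4 page is roughly 595 x 842 points.
box = crop_box(595, 842, parse_margins("120 50 120 180"))
# box == ((120.0, 180.0), (475.0, 792.0))
```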
Extract images of a PDF - optionally by page using PyMuPDF / fitz (Python)
2016-09-28T12:03:59-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580703-extract-images-of-a-pdf-optionally-by-page-using-p/
<p style="color: grey">
Python
recipe 580703
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/png/">png</a>).
</p>
<p>Two small scripts to extract images contained in a PDF document as PNG files.
(1) Script 1 extracts <strong>all</strong> images
(2) Script 2 extracts only images that are referenced by a page</p>
Improved ReportLab recipe for "page x of y" (Python)
2009-07-06T10:03:28-07:00
Vinay Sajip
http://code.activestate.com/recipes/users/4034162/
http://code.activestate.com/recipes/576832-improved-reportlab-recipe-for-page-x-of-y/
<p style="color: grey">
Python
recipe 576832
by <a href="/recipes/users/4034162/">Vinay Sajip</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/reportlab/">reportlab</a>).
Revision 2.
</p>
<p>This recipe is based on <a href="http://code.activestate.com/recipes/546511/">Recipe 546511</a>, which does not work reliably if there are images in the content.</p>
wxPython PDF Viewer using Poppler (Python)
2010-04-15T17:43:27-07:00
Marcelo Fernández
http://code.activestate.com/recipes/users/4173551/
http://code.activestate.com/recipes/577195-wxpython-pdf-viewer-using-poppler/
<p style="color: grey">
Python
recipe 577195
by <a href="/recipes/users/4173551/">Marcelo Fernández</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/poppler/">poppler</a>, <a href="/recipes/tags/python_poppler/">python_poppler</a>, <a href="/recipes/tags/viewer/">viewer</a>, <a href="/recipes/tags/wxpython/">wxpython</a>).
</p>
<p>This example shows a PDF Viewer class, which handles things like Zoom and Scrolling. It requires python-poppler and wxPython >= 2.8.9.</p>
How to parse a table in a PDF document (Python)
2016-04-10T22:43:57-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580635-how-to-parse-a-table-in-a-pdf-document/
<p style="color: grey">
Python
recipe 580635
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/table/">table</a>, <a href="/recipes/tags/xps/">xps</a>).
Revision 4.
</p>
<p>A Python function that converts a table contained in a page of a PDF (or OpenXPS, EPUB, CBZ, XPS) document to a matrix-like Python object (list of lists of strings).</p>
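<p>The target data structure (a list of lists of strings) can be illustrated with a small sketch that groups positioned words into rows and columns. Everything here is an assumption for illustration: the function name <code>words_to_table</code>, the <code>(x, y, text)</code> word tuples, the known column start positions and the row tolerance are not taken from the recipe.</p>

```python
import bisect


def words_to_table(words, column_starts, row_tolerance=2.0):
    """Group (x, y, text) words into a list of lists of strings.

    column_starts is a sorted list of x positions where each column begins.
    """
    # Step 1: collect words that share (roughly) the same y into rows.
    rows = []
    for word in sorted(words, key=lambda w: w[1]):
        if rows and abs(word[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])

    # Step 2: within each row, drop every word into its column.
    table = []
    for row in rows:
        cells = [""] * len(column_starts)
        for x, _y, text in sorted(row, key=lambda w: w[0]):
            col = max(bisect.bisect_right(column_starts, x) - 1, 0)
            cells[col] = (cells[col] + " " + text).strip()
        table.append(cells)
    return table


words = [(10, 100, "Name"), (120, 100.5, "Qty"),
         (10, 115, "Green"), (40, 115, "tea"), (120, 115, "2")]
table = words_to_table(words, column_starts=[0, 100])
# table == [["Name", "Qty"], ["Green tea", "2"]]
```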
PDF a Directory of Images using Reportlab (Python)
2009-04-12T08:35:10-07:00
andrew.canit
http://code.activestate.com/recipes/users/4169843/
http://code.activestate.com/recipes/576717-pdf-a-directory-of-images-using-reportlab/
<p style="color: grey">
Python
recipe 576717
by <a href="/recipes/users/4169843/">andrew.canit</a>
(<a href="/recipes/tags/directory/">directory</a>, <a href="/recipes/tags/images/">images</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>Walks through a directory and writes the images it finds to PDF using ReportLab.</p>
Convert PDF to plain text (Python)
2010-11-25T15:30:52-08:00
ccpizza
http://code.activestate.com/recipes/users/4170754/
http://code.activestate.com/recipes/577095-convert-pdf-to-plain-text/
<p style="color: grey">
Python
recipe 577095
by <a href="/recipes/users/4170754/">ccpizza</a>
(<a href="/recipes/tags/converter/">converter</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>This is a very raw PDF converter which has absolutely no idea of the page layout or text positioning.</p>
<p>To install the required module try <code>easy_install pypdf</code> in a console.</p>
How to use Python to convert a web page to PDF with a POST request to SelectPdf Online API and save it on the disk (Python)
2015-11-16T14:52:17-08:00
SelectPdf
http://code.activestate.com/recipes/users/4193129/
http://code.activestate.com/recipes/579126-how-to-use-python-to-convert-a-web-page-to-pdf-wit/
<p style="color: grey">
Python
recipe 579126
by <a href="/recipes/users/4193129/">SelectPdf</a>
(<a href="/recipes/tags/api/">api</a>, <a href="/recipes/tags/converter/">converter</a>, <a href="/recipes/tags/htmltopdf/">htmltopdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/selectpdf/">selectpdf</a>).
</p>
<p>This code converts a URL to PDF in Python using the SelectPdf HTML To PDF REST API through a POST request. The parameters are JSON encoded. The resulting content is saved to a file on disk.</p>
Rotate a PDF page in 3 lines (Python)
2016-11-06T11:33:59-08:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580713-rotate-a-pdf-page-in-3-lines/
<p style="color: grey">
Python
recipe 580713
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 2.
</p>
<p>PyMuPDF v1.9.3 now supports several new features for manipulating PDFs.</p>
<p>Here is an example to rotate a page with just a few lines of Python code.</p>
How to delete pages in a PDF using fitz / MuPDF / PyMuPDF (Python)
2016-05-01T09:26:44-07:00
Jorj X. McKie
http://code.activestate.com/recipes/users/4193772/
http://code.activestate.com/recipes/580657-how-to-delete-pages-in-a-pdf-using-fitz-mupdf-pymu/
<p style="color: grey">
Python
recipe 580657
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>).
</p>
<p>A new method <strong>select()</strong> in PyMuPDF 1.9.0 allows selecting pages of a PDF document to create a new one. Any Python list of integers (0 <= n < page count) can be taken.</p>
<p>The resulting PDF contains all links, annotations and bookmarks (provided they still point to valid targets).</p>
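<p>Since <strong>select()</strong> takes the pages to keep, deleting pages amounts to building the complementary list. The helper below is a hypothetical, library-free sketch of that step; handing the list to <strong>select()</strong> is only indicated in a comment.</p>

```python
def pages_to_keep(page_count, delete):
    """Return the 0-based page numbers that remain after deleting some."""
    unwanted = set(delete)
    invalid = sorted(n for n in unwanted if not 0 <= n < page_count)
    if invalid:
        raise ValueError(f"page numbers out of range: {invalid}")
    return [n for n in range(page_count) if n not in unwanted]


keep = pages_to_keep(6, [0, 4])  # [1, 2, 3, 5]
# The list would then be passed to select(), e.g. doc.select(keep).
```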
Roll your own Postscript code from scratch (Python)
2015-12-09T23:30:13-08:00
Jack Trainor
http://code.activestate.com/recipes/users/4076953/
http://code.activestate.com/recipes/579136-roll-your-own-postscript-code-from-scratch/
<p style="color: grey">
Python
recipe 579136
by <a href="/recipes/users/4076953/">Jack Trainor</a>
(<a href="/recipes/tags/ghostscript/">ghostscript</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/postscript/">postscript</a>, <a href="/recipes/tags/ps/">ps</a>).
</p>
<p>This recipe provides a mini-framework for creating custom Postscript PS and PDF files from scratch. It includes sample code for a personalized business index card.</p>
<p>The recipe does not use any Python PDF libraries. However, Ghostscript and a PDF viewer are useful for displaying and debugging the output.</p>
<p>It's easier than you might think to roll your own Postscript code!</p>
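<p>To give a flavour of hand-written PostScript generated from Python, here is a minimal sketch that is independent of the recipe's mini-framework. The function names, default coordinates and font are assumptions chosen for illustration; the output is plain text that an interpreter such as Ghostscript can render.</p>

```python
def ps_escape(text):
    """Escape the characters that are special inside a PostScript string."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def make_postscript(lines, x=72, y=720, font="Helvetica", size=12, leading=14):
    """Return a one-page PostScript program that prints the given lines."""
    out = ["%!PS", f"/{font} findfont {size} scalefont setfont"]
    for i, line in enumerate(lines):
        # Move down by one leading per line; the origin is bottom-left.
        out.append(f"{x} {y - i * leading} moveto ({ps_escape(line)}) show")
    out.append("showpage")
    return "\n".join(out) + "\n"


ps = make_postscript(["Jane Doe (CEO)", "jane@example.com"])
```

<p>Writing <code>ps</code> to a <code>.ps</code> file gives something a PostScript viewer can open directly.</p>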
Decrypt a PDF using fitz / MuPDF (PyMuPDF) (Python)
2016-03-17T12:22:10-07:00
Harald Lieder
http://code.activestate.com/recipes/users/4191581/
http://code.activestate.com/recipes/580627-decrypt-a-pdf-using-fitz-mupdf-pymupdf/
<p style="color: grey">
Python
recipe 580627
by <a href="/recipes/users/4191581/">Harald Lieder</a>
(<a href="/recipes/tags/decompression/">decompression</a>, <a href="/recipes/tags/decryption/">decryption</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/repair/">repair</a>).
</p>
<p>This is more of a code snippet. It shows how to dynamically check whether a PDF is password-protected and, if it is, how to decrypt it and save it back to disk unencrypted.</p>