Popular recipes tagged "pdf" http://code.activestate.com/recipes/tags/pdf/ 2017-07-11T18:57:54-07:00 ActiveState Code Recipes
Insert a Text Box in a PDF page (fitz / PyMuPDF) (Python)
2017-06-29T22:54:25-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580809-insert-a-text-box-in-a-pdf-page-fitz-pymupdf/
<p style="color: grey">
Python
recipe 580809
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/textbox/">textbox</a>).
</p>
<p>This method inserts text into a predefined rectangular area of a (new or existing) PDF page.
Words are distributed across the available space and wrapped onto new lines when required. Line breaks and tab characters are respected.
Text can be aligned in the box (left, center, right) and fonts can be freely chosen.
The method returns a float indicating how much vertical space is left over after filling the area.</p>
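<p>The wrapping and the "left-over space" return value can be illustrated with a simplified, standard-library-only model. This is a sketch, not PyMuPDF's implementation: the function name <code>fill_box</code>, the fixed character width and the line height are assumptions made for illustration only.</p>

```python
import textwrap


def fill_box(text, box_width, box_height, char_width=6.0, line_height=14.0):
    """Wrap text into a box and return (lines, leftover_height).

    Simplified model (assumption): every character is char_width points
    wide. A negative leftover means the text did not fit into the box.
    """
    max_chars = max(1, int(box_width // char_width))
    lines = []
    # Tabs are resolved to spaces; explicit line breaks are respected.
    for paragraph in text.expandtabs(4).split("\n"):
        # An empty paragraph still occupies one (blank) line.
        lines.extend(textwrap.wrap(paragraph, width=max_chars) or [""])
    leftover = box_height - len(lines) * line_height
    return lines, leftover


lines, leftover = fill_box("The quick brown fox\njumps over the lazy dog", 72, 100)
```

<p>A caller can check the sign of the returned float to decide whether the box was large enough.</p>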
Inserting Images on PDF Pages (Python)
2017-05-17T21:10:26-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580803-inserting-images-on-pdf-pages/
<p style="color: grey">
Python
recipe 580803
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page.
The following example puts the same image on every page of a given PDF - like a thumbnail.</p>
Convert Microsoft Excel (XLSX) to PDF with Python and xtopdf (Python)
2015-11-22T22:15:25-08:00 Vasudev Ram http://code.activestate.com/recipes/users/4173351/ http://code.activestate.com/recipes/579128-convert-microsot-excel-xlsx-to-pdf-with-python-and/
<p style="color: grey">
Python
recipe 579128
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/excel/">excel</a>, <a href="/recipes/tags/formats/">formats</a>, <a href="/recipes/tags/openpyxl/">openpyxl</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/xlsx/">xlsx</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows the basics of how to convert the text data in a Microsoft Excel file (XLSX format) to PDF (Portable Document Format). It uses openpyxl to read the XLSX file and xtopdf to generate the PDF file.</p>
Create Calendars on PDF with a few lines (Python)
2017-06-13T10:57:34-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580805-create-calendars-on-pdf-with-a-few-lines/
<p style="color: grey">
Python
recipe 580805
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/calendar/">calendar</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 2.
</p>
<p>PyMuPDF (fitz) provides easy-to-use ways to create PDF documents out of simple texts.</p>
<p>An example is the text output of Python's calendar module. Here we take a starting year as a script parameter and output a 3-page (A4 landscape) document with calendars for this and the following two years, in fewer than 20 lines of code.</p>
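<p>The text side of this idea needs only the standard library. The sketch below produces one calendar text per year for three consecutive years; writing each text onto a PDF page is left out, and the helper name <code>year_pages</code> and the choice of four months per row are assumptions, not taken from the recipe.</p>

```python
import calendar


def year_pages(start_year, count=3):
    """Return one plain-text calendar per year, starting at start_year."""
    cal = calendar.TextCalendar(firstweekday=calendar.MONDAY)
    # m=4 puts four months in each row, which suits a landscape page.
    return [cal.formatyear(start_year + i, m=4) for i in range(count)]


pages = year_pages(2017)
for text in pages:
    print(text.splitlines()[0].strip())  # first line of each page is the year
```

<p>Each string in <code>pages</code> is meant to be rendered in a monospaced font so that the columns stay aligned.</p>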
How to handle PDF embedded files with PyMuPDF (Python)
2017-07-11T18:57:54-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580796-how-to-handle-pdf-embedded-files-with-pymupdf/
<p style="color: grey">
Python
recipe 580796
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/embedded_files/">embedded_files</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 3.
</p>
<p>Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.</p>
<p>PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique): they allow arbitrary data to be incorporated in a PDF and to benefit from its unique features.</p>
Inserting pages into a PDF with PyMuPDF (Python)
2017-05-17T21:15:26-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580802-inserting-pages-into-a-pdf-with-pymupdf/
<p style="color: grey">
Python
recipe 580802
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/text_conversion/">text_conversion</a>).
Revision 2.
</p>
<p>Version 1.11.0 of PyMuPDF allows creating new PDF pages, as well as inserting images into existing pages.</p>
<p>Here is a script that converts any text file into a PDF.</p>
PDF Text Extraction using fitz / MuPDF (PyMuPDF) (Python)
2016-03-17T12:00:06-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580626-pdf-text-extraction-using-fitz-mupdf-pymupdf/
<p style="color: grey">
Python
recipe 580626
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/text_extraction/">text_extraction</a>, <a href="/recipes/tags/xps/">xps</a>).
</p>
<p>Extract all the text of a PDF (or other supported container types) at very high speed.
In general, text pieces of a PDF page are not arranged in natural reading order, but in the order they were entered during PDF creation.
This script re-arranges text blocks according to their pixel coordinates to achieve a more readable output, i.e. top-down, left-right.</p>
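<p>The re-ordering step can be sketched without any PDF library. The block layout <code>(x0, y0, text)</code> and the function name <code>reading_order</code> are assumptions for illustration; the sketch also assumes that y grows downwards, so a smaller y means "higher on the page".</p>

```python
def reading_order(blocks):
    """Sort text blocks top-down, then left-right.

    blocks: iterable of (x0, y0, text) tuples (hypothetical layout),
    where (x0, y0) is the top-left corner of the block.
    """
    ordered = sorted(blocks, key=lambda b: (b[1], b[0]))
    return [text for _, _, text in ordered]


# Blocks in "creation order", which is not the natural reading order.
blocks = [
    (300, 50, "right column, top"),
    (50, 400, "footer"),
    (50, 50, "left column, top"),
]
ordered_text = reading_order(blocks)
```

<p>Sorting on exact coordinates is the simplest choice; blocks whose tops differ by a fraction of a point would need a tolerance to be treated as one row.</p>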
How to Create a PDF with a Caustic Drawing (Python)
2017-06-18T17:43:47-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580806-how-to-create-a-pdf-with-a-caustic-drawing/
<p style="color: grey">
Python
recipe 580806
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Just a little demo on how to create simple drawings with PyMuPDF.</p>
<p>This script simulates what you see looking into your coffee mug, early in the morning after a long night of programming ...</p>
How to Maintain PDF Links with fitz / PyMuPDF (Python)
2017-03-22T13:12:25-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580765-how-to-maintain-pdf-links-with-fitz-pymupdf/
<p style="color: grey">
Python
recipe 580765
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/link/">link</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>This REPL script example displays, updates, inserts and deletes links on a PDF page.</p>
[xtopdf] Publish Delimiter-Separated Values (DSV data) to PDF (Python)
2016-12-17T19:08:33-08:00 Vasudev Ram http://code.activestate.com/recipes/users/4173351/ http://code.activestate.com/recipes/580736-xtopdf-publish-delimiter-separated-values-dsv-data/
<p style="color: grey">
Python
recipe 580736
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/commandline/">commandline</a>, <a href="/recipes/tags/csv/">csv</a>, <a href="/recipes/tags/data/">data</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/formats/">formats</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/tsv/">tsv</a>, <a href="/recipes/tags/utilities/">utilities</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to publish delimiter-separated values (a commonly used tabular data format) to PDF, using the xtopdf toolkit for PDF creation. It lets the user specify the delimiter via one of two command-line options - an ASCII code or an ASCII character. As Unix filters tend to do, it can operate either on standard input or on input filenames given as command-line arguments. In the case of multiple inputs via files, each input goes to a separate PDF output file.</p>
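<p>Accepting the delimiter either as an ASCII code or as a character can be sketched as follows. The function names <code>resolve_delimiter</code> and <code>read_dsv</code> are hypothetical and the recipe's actual command-line options are not reproduced here; only the standard <code>csv</code> module is used.</p>

```python
import csv
import io


def resolve_delimiter(ascii_code=None, char=None):
    """Return a one-character delimiter from exactly one of the two inputs."""
    if (ascii_code is None) == (char is None):
        raise ValueError("give exactly one of ascii_code or char")
    delimiter = chr(ascii_code) if ascii_code is not None else char
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    return delimiter


def read_dsv(stream, delimiter):
    """Read delimiter-separated rows from any text stream (file or stdin)."""
    return list(csv.reader(stream, delimiter=delimiter))


# ASCII code 124 is the pipe character "|".
rows = read_dsv(io.StringIO("a|b|c\n1|2|3\n"), resolve_delimiter(ascii_code=124))
```

<p>Because <code>read_dsv</code> takes a stream, the same function would work on <code>sys.stdin</code> or on an opened file, in the spirit of a Unix filter.</p>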
Convert wildcard text files to PDF with xtopdf (e.g. report*.txt) (Python)
2016-12-06T20:37:30-08:00 Vasudev Ram http://code.activestate.com/recipes/users/4173351/ http://code.activestate.com/recipes/580727-convert-wildcard-text-files-to-pdf-with-xtopdf-eg-/
<p style="color: grey">
Python
recipe 580727
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/conversion/">conversion</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/globbing/">globbing</a>, <a href="/recipes/tags/patterns/">patterns</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/wildcard/">wildcard</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to convert all text files matching a filename wildcard to PDF, using the xtopdf PDF creation toolkit. For example, if you specify report*.txt as the wildcard, all files in the current directory that match report*.txt will be converted to PDF, each in a separate PDF file. The original text files are not changed.</p>
<p>Here is a guide to installing and using xtopdf:</p>
<p><a href="http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html" rel="nofollow">http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html</a></p>
<p>More details on running the program, and sample output, are available here:</p>
<p><a href="http://jugad2.blogspot.in/2016/12/xtopdf-wildcard-text-files-to-pdf-with.html" rel="nofollow">http://jugad2.blogspot.in/2016/12/xtopdf-wildcard-text-files-to-pdf-with.html</a></p>
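<p>The wildcard matching itself can be sketched with the standard <code>glob</code> module. The helper <code>plan_conversions</code> is hypothetical and only pairs each matching text file with an output name; the actual PDF writing via xtopdf is not shown.</p>

```python
import glob
import os
import tempfile


def plan_conversions(pattern, directory="."):
    """Pair every file matching the wildcard with its own .pdf output path."""
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    return [(path, os.path.splitext(path)[0] + ".pdf") for path in matches]


# Demonstration in a throwaway directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("report1.txt", "report2.txt", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    plan = plan_conversions("report*.txt", tmp)
    names = [(os.path.basename(src), os.path.basename(dst)) for src, dst in plan]
```

<p>The source files are only read, never renamed or modified, which matches the recipe's statement that the originals stay unchanged.</p>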
Batch conversion of text files to PDF with fileinput and xtopdf (Python)
2016-11-07T20:28:01-08:00 Vasudev Ram http://code.activestate.com/recipes/users/4173351/ http://code.activestate.com/recipes/580715-batch-conversion-of-text-files-to-pdf-with-fileinp/
<p style="color: grey">
Python
recipe 580715
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/batch/">batch</a>, <a href="/recipes/tags/batchmode/">batchmode</a>, <a href="/recipes/tags/conversion/">conversion</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/utilities/">utilities</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to do a batch conversion of the content of multiple text files into a single PDF file, with a) an automatic page break after the content of each text file (in the PDF output), b) page numbering, and c) a header and footer on each page.</p>
<p>It uses the fileinput module (part of the Python standard library), and xtopdf, a Python library for conversion of other formats to PDF.</p>
<p>xtopdf is available here: <a href="https://bitbucket.org/vasudevram/xtopdf" rel="nofollow">https://bitbucket.org/vasudevram/xtopdf</a></p>
<p>and a guide to installing and using xtopdf is here:</p>
<p><a href="http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html" rel="nofollow">http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html</a></p>
<p>Here is a sample run of the program:</p>
<p>python BTTP123.pdf text1.txt text2.txt text3.txt</p>
<p>This will read the content from the three text files specified and write it into the PDF file specified, neatly formatted.</p>
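<p>The page-break logic can be sketched with <code>fileinput</code> alone. The function name <code>paginate</code> and the page size of 50 lines are assumptions; the sketch only groups lines into pages and numbers them, and leaves the PDF output to xtopdf.</p>

```python
import fileinput
import os
import tempfile


def paginate(paths, lines_per_page=50):
    """Group the lines of several files into pages.

    A new page starts whenever a page is full or a new input file begins.
    """
    pages, current = [], []
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # isfirstline() is true for the first line of each input file.
            if stream.isfirstline() and current:
                pages.append(current)
                current = []
            if len(current) == lines_per_page:
                pages.append(current)
                current = []
            current.append(line.rstrip("\n"))
    if current:
        pages.append(current)
    return pages


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, body in [("text1.txt", "a1\na2\n"), ("text2.txt", "b1\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write(body)
        paths.append(path)
    pages = paginate(paths)

# Page numbers for a footer can be derived by enumerating the pages.
numbered = list(enumerate(pages, start=1))
```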
Rotate a PDF page in 3 lines (Python)
2016-11-06T11:33:59-08:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580713-rotate-a-pdf-page-in-3-lines/
<p style="color: grey">
Python
recipe 580713
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 2.
</p>
<p>PyMuPDF v1.9.3 now supports several new features for manipulating PDFs.</p>
<p>Here is an example to rotate a page with just a few lines of Python code.</p>
How to delete pages in a PDF using fitz / MuPDF / PyMuPDF (Python)
2016-05-01T09:26:44-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580657-how-to-delete-pages-in-a-pdf-using-fitz-mupdf-pymu/
<p style="color: grey">
Python
recipe 580657
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>).
</p>
<p>A new method <strong>select()</strong> in PyMuPDF 1.9.0 allows selecting pages of a PDF document to create a new one. Any Python list of integers n (0 &lt;= n &lt; page count) can be used.</p>
<p>The resulting PDF contains all links, annotations and bookmarks (provided they still point to valid targets).</p>
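<p>Deleting pages then amounts to selecting all the others. The helper below is a hypothetical sketch that only builds and validates the list of page numbers to keep; passing that list on to <strong>select()</strong> is not shown.</p>

```python
def pages_to_keep(page_count, delete):
    """Return the 0-based page numbers that remain after deleting some pages."""
    doomed = set(delete)
    invalid = sorted(n for n in doomed if not 0 <= n < page_count)
    if invalid:
        raise ValueError(f"page numbers out of range: {invalid}")
    return [n for n in range(page_count) if n not in doomed]


keep = pages_to_keep(5, [1, 3])
```

<p>Validating the range up front mirrors the constraint 0 &lt;= n &lt; page count stated above.</p>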
Extract images of a PDF - optionally by page using PyMuPDF / fitz (Python)
2016-09-28T12:03:59-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580703-extract-images-of-a-pdf-optionally-by-page-using-p/
<p style="color: grey">
Python
recipe 580703
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/png/">png</a>).
</p>
<p>Two small scripts to extract images contained in a PDF document as PNG files.
(1) Script 1 extracts <strong>all</strong> images
(2) Script 2 extracts only images that are referenced by a page</p>
Read CSV with D and write it to PDF with Python (Python)
2016-10-26T17:49:00-07:00 Vasudev Ram http://code.activestate.com/recipes/users/4173351/ http://code.activestate.com/recipes/580710-read-csv-with-d-and-write-it-to-pdf-with-python/
<p style="color: grey">
Python
recipe 580710
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/conversion/">conversion</a>, <a href="/recipes/tags/csv/">csv</a>, <a href="/recipes/tags/data/">data</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/formats/">formats</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to read data from a CSV file with a D program and write that data to a PDF file with a Python program - all in a single command-line invocation (after writing the individual programs, of course).</p>
<p>It requires the xtopdf toolkit, which you can get from:</p>
<p><a href="https://bitbucket.org/vasudevram/xtopdf" rel="nofollow">https://bitbucket.org/vasudevram/xtopdf</a></p>
<p>Instructions for installing xtopdf:</p>
<p><a href="http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html" rel="nofollow">http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html</a></p>
<p>xtopdf in turn requires the open source version of the ReportLab toolkit, which you can get from:</p>
<p><a href="http://www.reportlab.com/ftp" rel="nofollow">http://www.reportlab.com/ftp</a> (<a href="http://www.reportlab.com/ftp/reportlab-1.21.1.tar.gz" rel="nofollow">http://www.reportlab.com/ftp/reportlab-1.21.1.tar.gz</a>)</p>
<p>It also requires the DMD compiler to compile the D program - this was the version used:</p>
<p>DMD32 D Compiler v2.071.2</p>
CSV export / import of PDF bookmarks (table of contents) (Python)
2017-01-07T12:21:39-08:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580743-csv-export-import-of-pdf-bookmarks-table-of-conten/
<p style="color: grey">
Python
recipe 580743
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/bookmarks/">bookmarks</a>, <a href="/recipes/tags/csv/">csv</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>Two little utilities to export a PDF's table of contents to, or import it from, a standard CSV file.
A typical use case would be:</p>
<ol>
<li>export TOC to CSV file</li>
<li>edit CSV file</li>
<li>import TOC from CSV file</li>
</ol>
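<p>The export / edit / import cycle above can be sketched with the standard <code>csv</code> module. The entry layout <code>[level, title, page]</code>, the semicolon delimiter and both function names are assumptions made for illustration; reading the table of contents from a PDF and writing it back are not shown.</p>

```python
import csv
import io


def toc_to_csv(toc):
    """Serialize table-of-contents entries [level, title, page] to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=";").writerows(toc)
    return buffer.getvalue()


def toc_from_csv(text):
    """Parse CSV text back into [level, title, page] entries."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return [[int(level), title, int(page)] for level, title, page in reader]


toc = [[1, "Introduction", 1], [2, "Scope; goals", 2], [1, "Results", 5]]
round_trip = toc_from_csv(toc_to_csv(toc))
```

<p>Titles that contain the delimiter are quoted by the writer, so they survive the round trip unchanged.</p>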
How to parse a table in a PDF document (Python)
2016-04-10T22:43:57-07:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580635-how-to-parse-a-table-in-a-pdf-document/
<p style="color: grey">
Python
recipe 580635
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/table/">table</a>, <a href="/recipes/tags/xps/">xps</a>).
Revision 4.
</p>
<p>A Python function that converts a table contained in a page of a PDF (or OpenXPS, EPUB, CBZ, XPS) document to a matrix-like Python object (list of lists of strings).</p>
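<p>One way to arrive at such a list of lists is to bucket positioned words into rows and columns. The following is a sketch under stated assumptions, not the recipe's algorithm: words come as <code>(x, y, text)</code> tuples, the left edge of every column is known in advance, and words whose y values differ by at most <code>row_tol</code> belong to the same row.</p>

```python
from bisect import bisect_right


def words_to_table(words, col_starts, row_tol=2.0):
    """Convert positioned words into a list of lists of strings.

    words: iterable of (x, y, text); col_starts: sorted left column edges.
    """
    # Step 1: group words into rows by their vertical position.
    rows = []  # each item: [row_y, [words in that row]]
    for word in sorted(words, key=lambda w: w[1]):
        if rows and abs(word[1] - rows[-1][0]) <= row_tol:
            rows[-1][1].append(word)
        else:
            rows.append([word[1], [word]])
    # Step 2: within each row, assign words to columns from left to right.
    table = []
    for _, row_words in rows:
        cells = [""] * len(col_starts)
        for x, _, text in sorted(row_words):
            col = max(0, bisect_right(col_starts, x) - 1)
            cells[col] = f"{cells[col]} {text}".strip()
        table.append(cells)
    return table


words = [
    (120, 100, "Qty"), (10, 100, "Name"),
    (40, 115, "tea"), (10, 115, "Green"), (120, 115, "3"),
]
table = words_to_table(words, col_starts=[0, 100])
```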
Access PDF annotations (Python)
2016-12-13T11:06:14-08:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580732-access-pdf-annotations/
<p style="color: grey">
Python
recipe 580732
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/annotation/">annotation</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>Version 1.10.0 of PyMuPDF supports PDF annotations. Among other things, they can be extracted as images and also updated to some extent.</p>
Reverse the sequence of annotations on a PDF page (Python)
2017-01-22T14:02:16-08:00 Jorj X. McKie http://code.activestate.com/recipes/users/4193772/ http://code.activestate.com/recipes/580733-reverse-the-sequence-of-annotations-on-a-pdf-page/
<p style="color: grey">
Python
recipe 580733
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/annotation/">annotation</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>).
Revision 2.
</p>
<p>Just another demonstration of PyMuPDF's features to deal with annotations:</p>
<p>Take a page with several annotations and let them change places in reverse order: the first and last annotations exchange their rectangles, as do the second and second-to-last, and so on.</p>
<p>The annotation images are enlarged or compressed as required to fit into their new areas.</p>
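<p>The rectangle exchange and the resulting enlargement or compression can be sketched on plain tuples. Rectangles are assumed to be <code>(x0, y0, x1, y1)</code>, and the function names are hypothetical; no annotation object from PyMuPDF is involved.</p>

```python
def reversed_rects(rects):
    """Return the target rectangle for each annotation: the list in reverse."""
    return list(reversed(rects))


def scale_factors(old, new):
    """Horizontal and vertical scaling needed to fit old's content into new."""
    old_w, old_h = old[2] - old[0], old[3] - old[1]
    new_w, new_h = new[2] - new[0], new[3] - new[1]
    return new_w / old_w, new_h / old_h


rects = [(0, 0, 10, 10), (20, 0, 60, 10), (0, 50, 20, 90)]
targets = reversed_rects(rects)
scales = [scale_factors(old, new) for old, new in zip(rects, targets)]
```

<p>With an odd number of annotations, the middle one keeps its rectangle and therefore its scale of 1.0 in both directions.</p>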