Most viewed recipes tagged "pdf"http://code.activestate.com/recipes/tags/pdf/views/2017-07-11T18:57:54-07:00ActiveState Code RecipesInsert a Text Box in a PDF page (fitz / PyMuPDF) (Python)
2017-06-29T22:54:25-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580809-insert-a-text-box-in-a-pdf-page-fitz-pymupdf/
<p style="color: grey">
Python
recipe 580809
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/textbox/">textbox</a>).
</p>
<p>This method inserts text into a predefined rectangular area of a (new or existing) PDF page.
Words are distributed across the available space, put on new lines when required etc. Line breaks and tab characters are respected / resolved.
Text can be aligned in the box (left, center, right) and fonts can be freely chosen.
The method returns a float indicating how vertical space is left over after filling the area.</p>
Create Calendars on PDF with a few lines (Python)
2017-06-13T10:57:34-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580805-create-calendars-on-pdf-with-a-few-lines/
<p style="color: grey">
Python
recipe 580805
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/calendar/">calendar</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 2.
</p>
<p>PyMuPDF (fitz) provides easy to use ways to create PDF documents out of simple texts.</p>
<p>An example is the text output of Python's calendar module. Here we take a starting year as script parameter and output a 3-page (A4 landscape) document with calendars for this and the following two years - in less than 20 lines of code.</p>
Inserting Images on PDF Pages (Python)
2017-05-17T21:10:26-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580803-inserting-images-on-pdf-pages/
<p style="color: grey">
Python
recipe 580803
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page.
The following example puts the same image on every page of a given PDF - like a thumbnail.</p>
Convert Microsot Excel (XLSX) to PDF with Python and xtopdf (Python)
2015-11-22T22:15:25-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579128-convert-microsot-excel-xlsx-to-pdf-with-python-and/
<p style="color: grey">
Python
recipe 579128
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/excel/">excel</a>, <a href="/recipes/tags/formats/">formats</a>, <a href="/recipes/tags/openpyxl/">openpyxl</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/xlsx/">xlsx</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how the basics of to convert the text data in a Microsoft Excel file (XLSX format) to PDF (Portable Document Format). It uses openpyxl to read the XLSX file and xtopdf to generate the PDF file.</p>
How to handle PDF embedded files with PyMuPDF (Python)
2017-07-11T18:57:54-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580796-how-to-handle-pdf-embedded-files-with-pymupdf/
<p style="color: grey">
Python
recipe 580796
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/embedded_files/">embedded_files</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
Revision 3.
</p>
<p>Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.</p>
<p>PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique), allowing arbitrary data to be incorporated in a PDF and benefit from its unique features.</p>
How to Create a PDF with a Caustic Drawing (Python)
2017-06-18T17:43:47-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580806-how-to-create-a-pdf-with-a-caustic-drawing/
<p style="color: grey">
Python
recipe 580806
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>).
</p>
<p>Just a little demo on how to create simple drawings with PyMuPDF.</p>
<p>This script simulates what you see looking into your coffee mug, early in the morning after a long night of programming ...</p>
Inserting pages into a PDF with PyMuPDF (Python)
2017-05-17T21:15:26-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580802-inserting-pages-into-a-pdf-with-pymupdf/
<p style="color: grey">
Python
recipe 580802
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/text_conversion/">text_conversion</a>).
Revision 2.
</p>
<p>Version 1.11.0 of PyMuPDF allows creating new PDF pages, as well as inserting images into existing pages.</p>
<p>Here is a script that converts any textfile into a PDF.</p>
Convert doc and docx files to pdf (Python)
2014-03-31T18:39:16-07:00Fabian Mayerhttp://code.activestate.com/recipes/users/4189629/http://code.activestate.com/recipes/578858-convert-doc-and-docx-files-to-pdf/
<p style="color: grey">
Python
recipe 578858
by <a href="/recipes/users/4189629/">Fabian Mayer</a>
(<a href="/recipes/tags/doc/">doc</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/win32com/">win32com</a>).
Revision 2.
</p>
<p>The Script converts all doc and docx files in a specified folder to pdf files. It checks whether the provided absolute path does actually exist and whether the specified folder contains any doc and docx files. It does not travers the directory recursively. The script is not portable and runs only a Windows machine. Based on the experience I made, I recommend closing MS Word before running the script.</p>
Convert Microsoft Word files to PDF with DOCXtoPDF (Python)
2013-12-24T23:03:43-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/578795-convert-microsoft-word-files-to-pdf-with-docxtopdf/
<p style="color: grey">
Python
recipe 578795
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/docx/">docx</a>, <a href="/recipes/tags/microsoft/">microsoft</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/word/">word</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>Yhis recipe shows how to convert Microsoft Word .DOCX files to PDF format, using the python-docx library and my xtopdf toolkit for PDF creation.</p>
<p>Note: The recipe has some limitations. E.g. fonts, tables, etc. from the input DOCX file are not preserved in the output PDF file.</p>
PDF Text Extraction using fitz / MuPDF (PyMuPDF) (Python)
2016-03-17T12:00:06-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580626-pdf-text-extraction-using-fitz-mupdf-pymupdf/
<p style="color: grey">
Python
recipe 580626
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/text_extraction/">text_extraction</a>, <a href="/recipes/tags/xps/">xps</a>).
</p>
<p>Extract all the text of a PDF (or other supported container types) at very high speed.
In general, text pieces of a PDF page are not arranged in natural reading order, but in the order they were entered during PDF creation.
This script re-arranges text blocks according to their pixel coordinates to achieve a more readable output, i.e. top-down, left-right.</p>
Crop PDF File with pyPdf (Python)
2011-11-03T17:42:10-07:00ccpizzahttp://code.activestate.com/recipes/users/4170754/http://code.activestate.com/recipes/576837-crop-pdf-file-with-pypdf/
<p style="color: grey">
Python
recipe 576837
by <a href="/recipes/users/4170754/">ccpizza</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pypdf/">pypdf</a>).
Revision 3.
</p>
<p>This recipe was originally posted by <code>sjvr767</code> on <a href="http://www.mobileread.com/forums/showthread.php?t=25565" rel="nofollow">http://www.mobileread.com/forums/showthread.php?t=25565</a> and I decided to also make it available here.</p>
<p>It uses pypdf (<a href="http://pybrary.net/pyPdf/%29" rel="nofollow">http://pybrary.net/pyPdf/)</a></p>
<p>The script is supposed to be run like this:</p>
<p><code>pdf_crop.py" -m "120 50 120 180" -i mypdf.pdf</code></p>
<p>where the margins are <code>left top right bottom</code></p>
<p>To install pyPdf try <code>easy_install pypdf</code>.</p>
Extract images of a PDF - optionally by page using PyMuPDF / fitz (Python)
2016-09-28T12:03:59-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580703-extract-images-of-a-pdf-optionally-by-page-using-p/
<p style="color: grey">
Python
recipe 580703
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/png/">png</a>).
</p>
<p>Two small scripts to extract images contained in a PDF document as PNG files.
(1) Script 1 extracts <strong>all</strong> images
(2) Script 2 extracts only images that are referenced by a page</p>
Improved ReportLab recipe for "page x of y" (Python)
2009-07-06T10:03:28-07:00Vinay Sajiphttp://code.activestate.com/recipes/users/4034162/http://code.activestate.com/recipes/576832-improved-reportlab-recipe-for-page-x-of-y/
<p style="color: grey">
Python
recipe 576832
by <a href="/recipes/users/4034162/">Vinay Sajip</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/reportlab/">reportlab</a>).
Revision 2.
</p>
<p>This recipe is based on <a href="http://code.activestate.com/recipes/546511/"><a href="http://code.activestate.com/recipes/546511/">Recipe 546511</a></a> which does not work reliably if there are images in the content.</p>
Convert HTML text to PDF with Beautiful Soup and xtopdf (Python)
2015-01-28T22:20:53-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579014-convert-html-text-to-pdf-with-beautiful-soup-and-x/
<p style="color: grey">
Python
recipe 579014
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/file/">file</a>, <a href="/recipes/tags/format/">format</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/reportgeneration/">reportgeneration</a>, <a href="/recipes/tags/reporting/">reporting</a>, <a href="/recipes/tags/reportlab/">reportlab</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to convert the text in an HTML document to PDF. It uses the Beautiful Soup and xtopdf Python libraries. Beautiful Soup is a library for HTML parsing and content extraction. xtopdf is a library for PDF creation from other formats, including text and many others.</p>
wxPython PDF Viewer using Poppler (Python)
2010-04-15T17:43:27-07:00Marcelo Fernándezhttp://code.activestate.com/recipes/users/4173551/http://code.activestate.com/recipes/577195-wxpython-pdf-viewer-using-poppler/
<p style="color: grey">
Python
recipe 577195
by <a href="/recipes/users/4173551/">Marcelo Fernández</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/poppler/">poppler</a>, <a href="/recipes/tags/python_poppler/">python_poppler</a>, <a href="/recipes/tags/viewer/">viewer</a>, <a href="/recipes/tags/wxpython/">wxpython</a>).
</p>
<p>This example shows a PDF Viewer class, which handles things like Zoom and Scrolling. It requires python-poppler and wxPython >= 2.8.9.</p>
Generate a PDF invoice with xtopdf and Python (Python)
2015-10-07T18:02:46-07:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579108-generate-a-pdf-invoice-with-xtopdf-and-python/
<p style="color: grey">
Python
recipe 579108
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/python2/">python2</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows the basics of how to generate invoices as PDF documents, using xtopdf, a Python toolkit for PDF creation. An example of creating a simple invoice through Python and xtopdf code is shown. </p>
<p>xtopdf is available on Bitbucket at:</p>
<p><a href="https://bitbucket.org/vasudevram/xtopdf" rel="nofollow">https://bitbucket.org/vasudevram/xtopdf</a></p>
<p>and you can find many links with examples of using xtopdf, how to install it, etc., from this Google search:</p>
<p><a href="https://google.com/search?q=xtopdf" rel="nofollow">https://google.com/search?q=xtopdf</a></p>
<p>This sub-feed of my blog:</p>
<p><a href="http://jugad2.blogspot.com/search/label/xtopdf" rel="nofollow">http://jugad2.blogspot.com/search/label/xtopdf</a></p>
<p>gives access to all my blog posts labelled xtopdf.</p>
How to parse a table in a PDF document (Python)
2016-04-10T22:43:57-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580635-how-to-parse-a-table-in-a-pdf-document/
<p style="color: grey">
Python
recipe 580635
by <a href="/recipes/users/4193772/">Jorj X. McKie</a>
(<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/table/">table</a>, <a href="/recipes/tags/xps/">xps</a>).
Revision 4.
</p>
<p>A Python function that converts a table contained in a page of a PDF (or OpenXPS, EPUB, CBZ, XPS) document to a matrix-like Python object (list of lists of strings).</p>
PDF a Directory of Images using Reportlab (Python)
2009-04-12T08:35:10-07:00andrew.canithttp://code.activestate.com/recipes/users/4169843/http://code.activestate.com/recipes/576717-pdf-a-directory-of-images-using-reportlab/
<p style="color: grey">
Python
recipe 576717
by <a href="/recipes/users/4169843/">andrew.canit</a>
(<a href="/recipes/tags/directory/">directory</a>, <a href="/recipes/tags/images/">images</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>Walk through a directory PDFing Images</p>
Convert PDF to plain text (Python)
2010-11-25T15:30:52-08:00ccpizzahttp://code.activestate.com/recipes/users/4170754/http://code.activestate.com/recipes/577095-convert-pdf-to-plain-text/
<p style="color: grey">
Python
recipe 577095
by <a href="/recipes/users/4170754/">ccpizza</a>
(<a href="/recipes/tags/converter/">converter</a>, <a href="/recipes/tags/pdf/">pdf</a>).
</p>
<p>This is a very raw PDF converter which has absolutely no idea of the page layout or text positioning.</p>
<p>To install the required module try <code>easy_install pypdf</code> in a console.</p>
Convert Excel to PDF with xlwings and xtopdf (Python)
2015-02-22T10:42:18-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579026-convert-excel-to-pdf-with-xlwings-and-xtopdf/
<p style="color: grey">
Python
recipe 579026
by <a href="/recipes/users/4173351/">Vasudev Ram</a>
(<a href="/recipes/tags/excel/">excel</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/reportlab/">reportlab</a>, <a href="/recipes/tags/xlwings/">xlwings</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>).
</p>
<p>This recipe shows how to get the text content from an Excel file and convert it to PDF, using the xlwings and xtopdf Python libraries. It also shows how to create an Excel file programmatically using xlwings.</p>