Most viewed recipes tagged "pdf" but not "files"http://code.activestate.com/recipes/tags/pdf-files/views/2017-07-11T18:57:54-07:00ActiveState Code RecipesInsert a Text Box in a PDF page (fitz / PyMuPDF) (Python) 2017-06-29T22:54:25-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580809-insert-a-text-box-in-a-pdf-page-fitz-pymupdf/ <p style="color: grey"> Python recipe 580809 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/textbox/">textbox</a>). </p> <p>This method inserts text into a predefined rectangular area of a (new or existing) PDF page. Words are distributed across the available space, put on new lines when required etc. Line breaks and tab characters are respected / resolved. Text can be aligned in the box (left, center, right) and fonts can be freely chosen. The method returns a float indicating how vertical space is left over after filling the area.</p> Create Calendars on PDF with a few lines (Python) 2017-06-13T10:57:34-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580805-create-calendars-on-pdf-with-a-few-lines/ <p style="color: grey"> Python recipe 580805 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/calendar/">calendar</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>). Revision 2. </p> <p>PyMuPDF (fitz) provides easy to use ways to create PDF documents out of simple texts.</p> <p>An example is the text output of Python's calendar module. Here we take a starting year as script parameter and output a 3-page (A4 landscape) document with calendars for this and the following two years - in less than 20 lines of code.</p> Inserting Images on PDF Pages (Python) 2017-05-17T21:10:26-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580803-inserting-images-on-pdf-pages/ <p style="color: grey"> Python recipe 580803 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>). </p> <p>Version 1.11.0 of PyMuPDF allows putting an image on an existing PDF page. The following example puts the same image on every page of a given PDF - like a thumbnail.</p> Convert Microsot Excel (XLSX) to PDF with Python and xtopdf (Python) 2015-11-22T22:15:25-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579128-convert-microsot-excel-xlsx-to-pdf-with-python-and/ <p style="color: grey"> Python recipe 579128 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/excel/">excel</a>, <a href="/recipes/tags/formats/">formats</a>, <a href="/recipes/tags/openpyxl/">openpyxl</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/xlsx/">xlsx</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>This recipe shows how the basics of to convert the text data in a Microsoft Excel file (XLSX format) to PDF (Portable Document Format). It uses openpyxl to read the XLSX file and xtopdf to generate the PDF file.</p> How to handle PDF embedded files with PyMuPDF (Python) 2017-07-11T18:57:54-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580796-how-to-handle-pdf-embedded-files-with-pymupdf/ <p style="color: grey"> Python recipe 580796 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/embedded_files/">embedded_files</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>). Revision 3. </p> <p>Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.</p> <p>PDF "/EmbeddedFiles" are similar to ZIP archives (or the Microsoft OLE technique), allowing arbitrary data to be incorporated in a PDF and benefit from its unique features.</p> How to Create a PDF with a Caustic Drawing (Python) 2017-06-18T17:43:47-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580806-how-to-create-a-pdf-with-a-caustic-drawing/ <p style="color: grey"> Python recipe 580806 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>). </p> <p>Just a little demo on how to create simple drawings with PyMuPDF.</p> <p>This script simulates what you see looking into your coffee mug, early in the morning after a long night of programming ...</p> Inserting pages into a PDF with PyMuPDF (Python) 2017-05-17T21:15:26-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580802-inserting-pages-into-a-pdf-with-pymupdf/ <p style="color: grey"> Python recipe 580802 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/text_conversion/">text_conversion</a>). Revision 2. </p> <p>Version 1.11.0 of PyMuPDF allows creating new PDF pages, as well as inserting images into existing pages.</p> <p>Here is a script that converts any textfile into a PDF.</p> Convert doc and docx files to pdf (Python) 2014-03-31T18:39:16-07:00Fabian Mayerhttp://code.activestate.com/recipes/users/4189629/http://code.activestate.com/recipes/578858-convert-doc-and-docx-files-to-pdf/ <p style="color: grey"> Python recipe 578858 by <a href="/recipes/users/4189629/">Fabian Mayer</a> (<a href="/recipes/tags/doc/">doc</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/win32com/">win32com</a>). Revision 2. </p> <p>The Script converts all doc and docx files in a specified folder to pdf files. It checks whether the provided absolute path does actually exist and whether the specified folder contains any doc and docx files. It does not travers the directory recursively. The script is not portable and runs only a Windows machine. Based on the experience I made, I recommend closing MS Word before running the script.</p> Convert Microsoft Word files to PDF with DOCXtoPDF (Python) 2013-12-24T23:03:43-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/578795-convert-microsoft-word-files-to-pdf-with-docxtopdf/ <p style="color: grey"> Python recipe 578795 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/docx/">docx</a>, <a href="/recipes/tags/microsoft/">microsoft</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/word/">word</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>Yhis recipe shows how to convert Microsoft Word .DOCX files to PDF format, using the python-docx library and my xtopdf toolkit for PDF creation.</p> <p>Note: The recipe has some limitations. E.g. fonts, tables, etc. from the input DOCX file are not preserved in the output PDF file.</p> PDF Text Extraction using fitz / MuPDF (PyMuPDF) (Python) 2016-03-17T12:00:06-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580626-pdf-text-extraction-using-fitz-mupdf-pymupdf/ <p style="color: grey"> Python recipe 580626 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/text_extraction/">text_extraction</a>, <a href="/recipes/tags/xps/">xps</a>). </p> <p>Extract all the text of a PDF (or other supported container types) at very high speed. In general, text pieces of a PDF page are not arranged in natural reading order, but in the order they were entered during PDF creation. This script re-arranges text blocks according to their pixel coordinates to achieve a more readable output, i.e. top-down, left-right.</p> Crop PDF File with pyPdf (Python) 2011-11-03T17:42:10-07:00ccpizzahttp://code.activestate.com/recipes/users/4170754/http://code.activestate.com/recipes/576837-crop-pdf-file-with-pypdf/ <p style="color: grey"> Python recipe 576837 by <a href="/recipes/users/4170754/">ccpizza</a> (<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pypdf/">pypdf</a>). Revision 3. </p> <p>This recipe was originally posted by <code>sjvr767</code> on <a href="http://www.mobileread.com/forums/showthread.php?t=25565" rel="nofollow">http://www.mobileread.com/forums/showthread.php?t=25565</a> and I decided to also make it available here.</p> <p>It uses pypdf (<a href="http://pybrary.net/pyPdf/%29" rel="nofollow">http://pybrary.net/pyPdf/)</a></p> <p>The script is supposed to be run like this:</p> <p><code>pdf_crop.py" -m "120 50 120 180" -i mypdf.pdf</code></p> <p>where the margins are <code>left top right bottom</code></p> <p>To install pyPdf try <code>easy_install pypdf</code>.</p> Extract images of a PDF - optionally by page using PyMuPDF / fitz (Python) 2016-09-28T12:03:59-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580703-extract-images-of-a-pdf-optionally-by-page-using-p/ <p style="color: grey"> Python recipe 580703 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/png/">png</a>). </p> <p>Two small scripts to extract images contained in a PDF document as PNG files. (1) Script 1 extracts <strong>all</strong> images (2) Script 2 extracts only images that are referenced by a page</p> Improved ReportLab recipe for "page x of y" (Python) 2009-07-06T10:03:28-07:00Vinay Sajiphttp://code.activestate.com/recipes/users/4034162/http://code.activestate.com/recipes/576832-improved-reportlab-recipe-for-page-x-of-y/ <p style="color: grey"> Python recipe 576832 by <a href="/recipes/users/4034162/">Vinay Sajip</a> (<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/reportlab/">reportlab</a>). Revision 2. </p> <p>This recipe is based on <a href="http://code.activestate.com/recipes/546511/"><a href="http://code.activestate.com/recipes/546511/">Recipe 546511</a></a> which does not work reliably if there are images in the content.</p> Convert HTML text to PDF with Beautiful Soup and xtopdf (Python) 2015-01-28T22:20:53-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579014-convert-html-text-to-pdf-with-beautiful-soup-and-x/ <p style="color: grey"> Python recipe 579014 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/file/">file</a>, <a href="/recipes/tags/format/">format</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/reportgeneration/">reportgeneration</a>, <a href="/recipes/tags/reporting/">reporting</a>, <a href="/recipes/tags/reportlab/">reportlab</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>This recipe shows how to convert the text in an HTML document to PDF. It uses the Beautiful Soup and xtopdf Python libraries. Beautiful Soup is a library for HTML parsing and content extraction. xtopdf is a library for PDF creation from other formats, including text and many others.</p> wxPython PDF Viewer using Poppler (Python) 2010-04-15T17:43:27-07:00Marcelo Fernándezhttp://code.activestate.com/recipes/users/4173551/http://code.activestate.com/recipes/577195-wxpython-pdf-viewer-using-poppler/ <p style="color: grey"> Python recipe 577195 by <a href="/recipes/users/4173551/">Marcelo Fernández</a> (<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/poppler/">poppler</a>, <a href="/recipes/tags/python_poppler/">python_poppler</a>, <a href="/recipes/tags/viewer/">viewer</a>, <a href="/recipes/tags/wxpython/">wxpython</a>). </p> <p>This example shows a PDF Viewer class, which handles things like Zoom and Scrolling. It requires python-poppler and wxPython &gt;= 2.8.9.</p> Generate a PDF invoice with xtopdf and Python (Python) 2015-10-07T18:02:46-07:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579108-generate-a-pdf-invoice-with-xtopdf-and-python/ <p style="color: grey"> Python recipe 579108 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/python2/">python2</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>This recipe shows the basics of how to generate invoices as PDF documents, using xtopdf, a Python toolkit for PDF creation. An example of creating a simple invoice through Python and xtopdf code is shown. </p> <p>xtopdf is available on Bitbucket at:</p> <p><a href="https://bitbucket.org/vasudevram/xtopdf" rel="nofollow">https://bitbucket.org/vasudevram/xtopdf</a></p> <p>and you can find many links with examples of using xtopdf, how to install it, etc., from this Google search:</p> <p><a href="https://google.com/search?q=xtopdf" rel="nofollow">https://google.com/search?q=xtopdf</a></p> <p>This sub-feed of my blog:</p> <p><a href="http://jugad2.blogspot.com/search/label/xtopdf" rel="nofollow">http://jugad2.blogspot.com/search/label/xtopdf</a></p> <p>gives access to all my blog posts labelled xtopdf.</p> How to parse a table in a PDF document (Python) 2016-04-10T22:43:57-07:00Jorj X. McKiehttp://code.activestate.com/recipes/users/4193772/http://code.activestate.com/recipes/580635-how-to-parse-a-table-in-a-pdf-document/ <p style="color: grey"> Python recipe 580635 by <a href="/recipes/users/4193772/">Jorj X. McKie</a> (<a href="/recipes/tags/cbz/">cbz</a>, <a href="/recipes/tags/epub/">epub</a>, <a href="/recipes/tags/fitz/">fitz</a>, <a href="/recipes/tags/mupdf/">mupdf</a>, <a href="/recipes/tags/openxps/">openxps</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pymupdf/">pymupdf</a>, <a href="/recipes/tags/table/">table</a>, <a href="/recipes/tags/xps/">xps</a>). Revision 4. </p> <p>A Python function that converts a table contained in a page of a PDF (or OpenXPS, EPUB, CBZ, XPS) document to a matrix-like Python object (list of lists of strings).</p> PDF a Directory of Images using Reportlab (Python) 2009-04-12T08:35:10-07:00andrew.canithttp://code.activestate.com/recipes/users/4169843/http://code.activestate.com/recipes/576717-pdf-a-directory-of-images-using-reportlab/ <p style="color: grey"> Python recipe 576717 by <a href="/recipes/users/4169843/">andrew.canit</a> (<a href="/recipes/tags/directory/">directory</a>, <a href="/recipes/tags/images/">images</a>, <a href="/recipes/tags/pdf/">pdf</a>). </p> <p>Walk through a directory PDFing Images</p> Convert PDF to plain text (Python) 2010-11-25T15:30:52-08:00ccpizzahttp://code.activestate.com/recipes/users/4170754/http://code.activestate.com/recipes/577095-convert-pdf-to-plain-text/ <p style="color: grey"> Python recipe 577095 by <a href="/recipes/users/4170754/">ccpizza</a> (<a href="/recipes/tags/converter/">converter</a>, <a href="/recipes/tags/pdf/">pdf</a>). </p> <p>This is a very raw PDF converter which has absolutely no idea of the page layout or text positioning.</p> <p>To install the required module try <code>easy_install pypdf</code> in a console.</p> Convert Excel to PDF with xlwings and xtopdf (Python) 2015-02-22T10:42:18-08:00Vasudev Ramhttp://code.activestate.com/recipes/users/4173351/http://code.activestate.com/recipes/579026-convert-excel-to-pdf-with-xlwings-and-xtopdf/ <p style="color: grey"> Python recipe 579026 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/excel/">excel</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/reportlab/">reportlab</a>, <a href="/recipes/tags/xlwings/">xlwings</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>This recipe shows how to get the text content from an Excel file and convert it to PDF, using the xlwings and xtopdf Python libraries. It also shows how to create an Excel file programmatically using xlwings.</p>