Most viewed recipes tagged "text_processing" Code Recipes
Plain Text Editor in Python (Python) 2013-06-18T15:33:01-07:00Captain DeadBones <p style="color: grey"> Python recipe 578568 by <a href="/recipes/users/4184772/">Captain DeadBones</a> (<a href="/recipes/tags/editor/">editor</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>Just a simple text editor written in Python with Tk for graphics. </p> <p>Check out my blog <a href="">Captain DeadBones Chronicles</a></p> Remove diatrical marks (including accents) from strings using latin alphabets (Python) 2009-02-11T11:40:55-08:00Sylvain Fourmanoit <p style="color: grey"> Python recipe 576648 by <a href="/recipes/users/4150902/">Sylvain Fourmanoit</a> (<a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). Revision 7. </p> <p>Many written languages that use Latin alphabets employ <a href="">diacritical marks</a>. Even today, it is still common to need to get rid of them: file naming, creation of easy-to-read URIs, indexing schemes, etc. </p> <p>An easy way has always been to simply filter out any "decorated characters"; unfortunately, this does not preserve the base, undecorated glyphs. Thanks to Unicode support in Python, it is now straightforward to perform such a transliteration.</p> <p>(This recipe was completely rewritten based on a comment by Mathieu Clabaut: many thanks to him!)</p> Split a string on capitalized / uppercase char using Python (Python) 2009-12-11T23:16:36-08:00activestate <p style="color: grey"> Python recipe 576984 by <a href="/recipes/users/4172588/">activestate</a> (<a href="/recipes/tags/string/">string</a>, <a href="/recipes/tags/string_parsing/">string_parsing</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). Revision 6. 
</p> <p>By user <a href="" rel="nofollow"></a> in comment on <a href="" rel="nofollow"></a> but modified slightly.</p> <p>Splits any string on uppercase characters.</p> <p>Example:</p> <pre class="prettyprint"><code>&gt;&gt;&gt; print split_uppercase("thisIsIt and SoIsThis")
this Is It and  So Is This
</code></pre> <p>Note the two spaces after 'and'.</p> Text Editor in Python 3.3 (Python) 2013-06-19T15:58:17-07:00Stephen Chappell <p style="color: grey"> Python recipe 578569 by <a href="/recipes/users/2608421/">Stephen Chappell</a> (<a href="/recipes/tags/editor/">editor</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>This is a simple text editor written in Python using <code>tkinter</code> for graphics.</p> <p>Check out Captain DeadBones' <a href="">Chronicles</a> blog.</p> Batch conversion of text files to PDF with fileinput and xtopdf (Python) 2016-11-07T20:28:01-08:00Vasudev Ram <p style="color: grey"> Python recipe 580715 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/batch/">batch</a>, <a href="/recipes/tags/batchmode/">batchmode</a>, <a href="/recipes/tags/conversion/">conversion</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/utilities/">utilities</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). 
</p> <p>This recipe shows how to do a batch conversion of the content of multiple text files into a single PDF file, with a) an automatic page break after the content of each text file (in the PDF output), b) page numbering, and c) a header and footer on each page.</p> <p>It uses the fileinput module (part of the Python standard library), and xtopdf, a Python library for conversion of other formats to PDF.</p> <p>xtopdf is available here: <a href="" rel="nofollow"></a></p> <p>and a guide to installing and using xtopdf is here:</p> <p><a href="" rel="nofollow"></a></p> <p>Here is a sample run of the program:</p> <p>python BTTP123.pdf text1.txt text2.txt text3.txt</p> <p>This will read the content from the three text files specified and write it into the PDF file specified, neatly formatted.</p> Convert wildcard text files to PDF with xtopdf (e.g. report*.txt) (Python) 2016-12-06T20:37:30-08:00Vasudev Ram <p style="color: grey"> Python recipe 580727 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/conversion/">conversion</a>, <a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/globbing/">globbing</a>, <a href="/recipes/tags/patterns/">patterns</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/pdfwriter/">pdfwriter</a>, <a href="/recipes/tags/pdf_generation/">pdf_generation</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/wildcard/">wildcard</a>, <a href="/recipes/tags/xtopdf/">xtopdf</a>). </p> <p>This recipe shows how to convert all text files matching a filename wildcard to PDF, using the xtopdf PDF creation toolkit. For example, if you specify report*.txt as the wildcard, all files in the current directory that match report*.txt will be converted to PDF, each in a separate PDF file. 
The original text files are not changed.</p> <p>Here is a guide to installing and using xtopdf:</p> <p><a href="" rel="nofollow"></a></p> <p>More details on running the program, and sample output, are available here:</p> <p><a href="" rel="nofollow"></a></p> State Machine for Text Processing (Python) 2009-01-21T14:01:23-08:00Jack Trainor <p style="color: grey"> Python recipe 576624 by <a href="/recipes/users/4076953/">Jack Trainor</a> (<a href="/recipes/tags/state_machine/">state_machine</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>A general state machine mechanism plus a specialized version, LineStateMachine, for processing text files, using regular expressions to determine state transitions.</p> Extracting structured text or code (Python) 2011-05-18T13:04:01-07:00Mike Sweeney <p style="color: grey"> Python recipe 577700 by <a href="/recipes/users/4177990/">Mike Sweeney</a> (<a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/structured/">structured</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/token/">token</a>). Revision 2. </p> <p>This function uses regular expressions to extract parts of a structured text string. It can build a token list from many types of code and data formats. It finds string types (with quotes) and nested structures that use parentheses, brackets, and braces. 
If you need to extract a different syntax, you can provide a custom token pattern in the function arguments.</p> grade keeper (Python) 2009-01-12T09:38:11-08:00Caleb Herbert <p style="color: grey"> Python recipe 543261 by <a href="/recipes/users/4118572/">Caleb Herbert</a> (<a href="/recipes/tags/easy/">easy</a>, <a href="/recipes/tags/grades/">grades</a>, <a href="/recipes/tags/homework/">homework</a>, <a href="/recipes/tags/records/">records</a>, <a href="/recipes/tags/school/">school</a>, <a href="/recipes/tags/simple/">simple</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). Revision 3. </p> <p>This code was my first attempt at making a useful program. It stores grades in a text file after asking you a few questions, such as the subject and the number of questions right.</p> The Bentley-Knuth problem and solutions (Python) 2014-03-15T23:46:59-07:00Vasudev Ram <p style="color: grey"> Python recipe 578851 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/algorithms/">algorithms</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>This is a program Jon Bentley asked Donald Knuth to write, and is one that’s become familiar to people who use languages with serious text-handling capabilities: Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies. I wrote 2 solutions for it earlier, in Python and in Unix shell. 
Also see the comment by a user on the post, giving another solution.</p> Print selected text pages to PDF with Python, selpg and xtopdf on Linux (Bash) 2014-10-29T17:38:10-07:00Vasudev Ram <p style="color: grey"> Bash recipe 578954 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/bash/">bash</a>, <a href="/recipes/tags/linux/">linux</a>, <a href="/recipes/tags/pdf/">pdf</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/reportlab/">reportlab</a>, <a href="/recipes/tags/shell/">shell</a>, <a href="/recipes/tags/text/">text</a>, <a href="/recipes/tags/text_files/">text_files</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/unix/">unix</a>). </p> <p>This recipe shows how to use selpg, a Linux command-line utility written in C, together with xtopdf, a Python toolkit for PDF creation, to print only a selected range of pages from a text file, to a PDF file, for display or print purposes. The way to do this is to run the selpg utility at the Linux command line, with options specifying the start and end pages of the range, and pipe its output to the program, which is a part of the xtopdf toolkit.</p> uniform matcher( "re pattern" / re / func / dict / list / tuple / set ) (Python) 2009-05-06T06:17:16-07:00denis <p style="color: grey"> Python recipe 576741 by <a href="/recipes/users/4168005/">denis</a> (<a href="/recipes/tags/grep/">grep</a>, <a href="/recipes/tags/re/">re</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/uniform/">uniform</a>). Revision 2. </p> <p>matcher() makes a string matcher function from any of:</p> <ul> <li>"RE pattern string"</li> <li>re.compile()</li> <li>a function, i.e. 
callable</li> <li>a dict / list / tuple / set / container</li> </ul> <p>This uniformity is simple, useful, a Good Thing.</p> <p>A few example functions using matchers are here too: grep getfields kwgrep.</p> Routine to i18nify any word (Python) 2016-05-19T18:41:26-07:00Vasudev Ram <p style="color: grey"> Python recipe 580662 by <a href="/recipes/users/4173351/">Vasudev Ram</a> (<a href="/recipes/tags/i18n/">i18n</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/strings/">strings</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>This recipe shows a routine and a driver program that lets you "i18nify" any word, similar to how the word "internationalization" is shortened to "i18n", and "localization" to "l10n".</p> [python3-tk/ttk] Onager Scratchpad (Python) 2016-04-24T02:34:03-07:00Mickey Kocic <p style="color: grey"> Python recipe 580650 by <a href="/recipes/users/4193984/">Mickey Kocic</a> (<a href="/recipes/tags/python3/">python3</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/tkinter/">tkinter</a>, <a href="/recipes/tags/ttk/">ttk</a>, <a href="/recipes/tags/windows/">windows</a>). Revision 2. </p> <p>I wrote this simple text editor to use for my diary. It's customized the way I like it, but the code is set up so it's easy for other people to change bg, fg, font, etc. I've also compiled a standalone Windows executable (thank you very much ActiveState! without ActivePython the compilation would have been impossible). 
Anyone who wants a copy of the executable is free to message or email me.</p> <p>NOTE: If you get an error that the theme is not recognized, just comment out line 18 or run the following code in your python3 interpreter:</p> <pre class="prettyprint"><code>&gt;&gt;&gt; from tkinter.ttk import Style
&gt;&gt;&gt; s = Style()
&gt;&gt;&gt; s.theme_names()
</code></pre> <p>You'll get a list of the available themes and can replace the 'alt' in line 18 with any one of them you want.</p> Simple tabulator (Python) 2010-11-09T12:50:06-08:00Noufal Ibrahim <p style="color: grey"> Python recipe 577458 by <a href="/recipes/users/4173873/">Noufal Ibrahim</a> (<a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>This is a simple script to convert a top-to-bottom list of items into a left-to-right list.</p> <pre class="prettyprint"><code>a
b
c
d
e
f
g
h
i
j
k
l
m
</code></pre> <p>into</p> <pre class="prettyprint"><code> a b c d e f g h i j k l m </code></pre> <p>A few command line options allow some amount of customisation. </p> (Regex based simple parsing engine) (Python) 2013-05-26T18:00:58-07:00Mike 'Fuzzy' Partin <p style="color: grey"> Python recipe 578532 by <a href="/recipes/users/4179778/">Mike 'Fuzzy' Partin</a> (<a href="/recipes/tags/parser/">parser</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/text_processing/">text_processing</a>). </p> <p>A parsing engine that allows you to define sets of patterns and callbacks, and process any I/O object in Python that has a readline() method.</p> Text Model (Python) 2015-01-13T22:56:53-08:00Chris Ecker <p style="color: grey"> Python recipe 577978 by <a href="/recipes/users/4180203/">Chris Ecker</a> (<a href="/recipes/tags/datastuctures/">datastuctures</a>, <a href="/recipes/tags/text_processing/">text_processing</a>, <a href="/recipes/tags/tree/">tree</a>). Revision 3. </p> <p>A tree data type holding text data together with styling information. </p>