Popular Python recipes tagged "regex"
http://code.activestate.com/recipes/langs/python/tags/regex/
2016-04-28T13:24:15-07:00
ActiveState Code Recipes
Recursive find replace in files using regex (Python)
2016-04-28T13:24:15-07:00 ccpizza http://code.activestate.com/recipes/users/4170754/ http://code.activestate.com/recipes/580653-recursive-find-replace-in-files-using-regex/
<p style="color: grey">
Python
recipe 580653
by <a href="/recipes/users/4170754/">ccpizza</a>
(<a href="/recipes/tags/files/">files</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/replace/">replace</a>, <a href="/recipes/tags/search/">search</a>).
Revision 5.
</p>
<h4 id="recursively-find-and-replace-text-in-files-under-a-specific-folder-with-preview-of-changed-data-in-dry-run-mode">Recursively find and replace text in files under a specific folder with preview of changed data in dry-run mode</h4>
<h5 id="example-usage">Example Usage</h5>
<p><strong>See what is going to change (dry run):</strong></p>
<pre class="prettyprint"><code>find_replace.py --dir project/myfolder --search-regex "\d{4}-\d{2}-\d{2}" --replace-regex "2012-12-12" --dryrun
</code></pre>
<p><strong>Do actual replacement:</strong></p>
<pre class="prettyprint"><code>find_replace.py --dir project/myfolder --search-regex "\d{4}-\d{2}-\d{2}" --replace-regex "2012-12-12"
</code></pre>
<p><strong>Do actual replacement and create backup files:</strong></p>
<pre class="prettyprint"><code>find_replace.py --dir project/myfolder --search-regex "\d{4}-\d{2}-\d{2}" --replace-regex "2012-12-12" --create-backup
</code></pre>
<p><strong>Same action as previous command with short-hand syntax:</strong></p>
<pre class="prettyprint"><code>find_replace.py -d project/myfolder -s "\d{4}-\d{2}-\d{2}" -r "2012-12-12" -b
</code></pre>
<p>Output of <code>find_replace.py -h</code>:</p>
<pre class="prettyprint"><code>DESCRIPTION:
Find and replace recursively from the given folder using regular expressions
optional arguments:
-h, --help show this help message and exit
--dir DIR, -d DIR folder to search in; by default current folder
--search-regex SEARCH_REGEX, -s SEARCH_REGEX
search regex
--replace-regex REPLACE_REGEX, -r REPLACE_REGEX
replacement regex
--glob GLOB, -g GLOB glob pattern, i.e. *.html
--dryrun, -dr don't replace anything just show what is going to be
done
--create-backup, -b Create backup files
USAGE:
find_replace.py -d [my_folder] -s <search_regex> -r <replace_regex> -g [glob_pattern]
</code></pre>
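<p>The script itself is not reproduced in this listing, so the following is only a minimal sketch of the behaviour described above (recursive walk, glob filter, dry run, optional backup). The function name <code>find_replace</code>, its parameters, the UTF-8 assumption, and the <code>.bak</code> backup suffix are all hypothetical and not taken from the recipe.</p>

```python
import fnmatch
import os
import re
import shutil


def find_replace(root, search, replace, glob="*", dry_run=False, backup=False):
    """Apply a regex substitution to every matching file under root.

    Returns a list of (path, substitution_count) for the files that changed,
    or that would change when dry_run is True.
    """
    pattern = re.compile(search)
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, glob):
            path = os.path.join(dirpath, name)
            # Assumption: files are UTF-8 text; binary files are not handled.
            with open(path, encoding="utf-8") as fh:
                text = fh.read()
            new_text, count = pattern.subn(replace, text)
            if count == 0:
                continue
            changed.append((path, count))
            if dry_run:
                continue  # preview only: leave the file untouched
            if backup:
                # Hypothetical naming scheme for the backup copy.
                shutil.copy2(path, path + ".bak")
            with open(path, "w", encoding="utf-8") as fh:
                fh.write(new_text)
    return changed
```

<p>Calling it with <code>dry_run=True</code> first and inspecting the returned list mirrors the preview-then-replace workflow shown in the usage examples.</p>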
Recapitalize word in the beginning of every sentence (Python)
2014-01-05T14:11:00-08:00 sky kok http://code.activestate.com/recipes/users/4188838/ http://code.activestate.com/recipes/578796-recapitalize-word-in-the-beginning-of-every-senten/
<p style="color: grey">
Python
recipe 578796
by <a href="/recipes/users/4188838/">sky kok</a>
(<a href="/recipes/tags/regex/">regex</a>).
Revision 2.
</p>
<p>Recapitalizes text, placing capitals after end-of-sentence punctuation. For example, it turns "hello world. how are you?" into "Hello world. How are you?"</p>
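<p>One way this could be done with <code>re.sub</code> and a callback is sketched below. The function name <code>recapitalize</code> is an assumption, and the pattern is deliberately naive: it would also capitalize after abbreviations such as "e.g.".</p>

```python
import re


def recapitalize(text):
    # Uppercase a lowercase ASCII letter at the start of the text, or after
    # '.', '!' or '?' followed by whitespace.
    return re.sub(
        r"(^\s*|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
```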
slurp.py (Regex based simple parsing engine) (Python)
2013-05-26T18:00:58-07:00 Mike 'Fuzzy' Partin http://code.activestate.com/recipes/users/4179778/ http://code.activestate.com/recipes/578532-slurppy-regex-based-simple-parsing-engine/
<p style="color: grey">
Python
recipe 578532
by <a href="/recipes/users/4179778/">Mike 'Fuzzy' Partin</a>
(<a href="/recipes/tags/parser/">parser</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/text_processing/">text_processing</a>).
</p>
<p>A parsing engine that allows you to define sets of patterns and callbacks, and process any I/O object in Python that has a readline() method.</p>
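<p>The idea of pairing patterns with callbacks can be sketched roughly as follows. This is not the slurp.py API: the function <code>slurp</code>, the rule format, and the sample log lines are all made up for illustration.</p>

```python
import io
import re


def slurp(stream, rules):
    """Read stream line by line and call the callback of the first matching rule."""
    compiled = [(re.compile(pattern), callback) for pattern, callback in rules]
    while True:
        line = stream.readline()
        if not line:  # readline() returns '' at end of input
            break
        for pattern, callback in compiled:
            match = pattern.search(line)
            if match:
                callback(match)
                break


# Hypothetical rules: collect error and warning messages from log-like text.
errors, warnings = [], []
rules = [
    (r"^ERROR (?P<msg>.*)$", lambda m: errors.append(m.group("msg"))),
    (r"^WARN (?P<msg>.*)$", lambda m: warnings.append(m.group("msg"))),
]
slurp(io.StringIO("ERROR disk full\nINFO ok\nWARN slow\n"), rules)
```

<p>Because only <code>readline()</code> is used, the same function would accept an open file, a socket file object, or an in-memory buffer.</p>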
Minimalistic Word Wrap using Regex (Python)
2012-06-07T14:31:44-07:00 Alfe http://code.activestate.com/recipes/users/4182236/ http://code.activestate.com/recipes/578162-minimalistic-word-wrap-using-regex/
<p style="color: grey">
Python
recipe 578162
by <a href="/recipes/users/4182236/">Alfe</a>
(<a href="/recipes/tags/minimalistic/">minimalistic</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/word/">word</a>, <a href="/recipes/tags/wrap/">wrap</a>).
</p>
<p>I know there is the module textwrap and other recipes like <a href="http://code.activestate.com/recipes/148061-one-liner-word-wrap-function/" rel="nofollow">http://code.activestate.com/recipes/148061-one-liner-word-wrap-function/</a></p>
<p>But in some situations my recipe for a very simple word wrap might come in handy nevertheless.</p>
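<p>The recipe body is not included in this listing; a regex-based wrap along these lines might look like the sketch below. The function name <code>wrap</code> and the pattern are assumptions, and a word longer than <code>width</code> loses its leading characters, so this is only suitable when that cannot happen.</p>

```python
import re


def wrap(text, width):
    # Take up to `width` characters, then require whitespace or end of text,
    # so that lines break between words. The separator is not captured.
    return re.findall(r"(.{1,%d})(?:\s+|$)" % width, text)
```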
Cheap-date trick; a different way to parse (Python)
2012-03-06T14:08:10-08:00 Scott S-Allen http://code.activestate.com/recipes/users/4181178/ http://code.activestate.com/recipes/578064-cheap-date-trick-a-different-way-to-parse/
<p style="color: grey">
Python
recipe 578064
by <a href="/recipes/users/4181178/">Scott S-Allen</a>
(<a href="/recipes/tags/cheap/">cheap</a>, <a href="/recipes/tags/date/">date</a>, <a href="/recipes/tags/format/">format</a>, <a href="/recipes/tags/grep/">grep</a>, <a href="/recipes/tags/parse/">parse</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/sharp/">sharp</a>).
</p>
<p>... a light meal with a heavy dose of "tutorial mash" on the side.</p>
<p>In the constructive spirit of "more ways to solve a problem"; this is a portion of my lateral, occasionally oblique, solutions. Nothing new in le régime de grande, but hopefully the conceptual essence will amuse.</p>
<p>This started as a response to <a href="http://code.activestate.com/recipes/577135/">recipe 577135</a>, which parses incremental date fragments and preserves microseconds where available. That script does more work than this one, for sure, but it requires special flow control and iterates over a potentially encumbering shopping list (multi-dimensional, with some detail).</p>
<p>So here's a different box for others to play with. Upside-down in a sense, it doesn't hunt for anything but a numerical "pulse"; sequences of digits punctuated by other 'stuff' we don't much care about.</p>
<p>Missing a lot of things, intentionally, this snippet provides several examples demoin' flexibility. Easy to button-up, redecorate and extend later for show, till then the delightful commentary makes it hard enough to see bones already -- all six lines or so!</p>
<p><strong>Note:</strong> <em>The core script is repeated for illustrative purposes. The first is step-by-step, the second is lean and condensed for utilitarian purposes. It is the second, shorter, version that I yanked from a file and gussied up.</em></p>
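<p>The "numerical pulse" idea, pulling out runs of digits and ignoring whatever separates them, can be sketched as below. The function name <code>cheap_date</code> and the assumed field order (year, month, day, then optional time parts) are not from the recipe, and inputs with a time-zone offset or a fractional second would need extra handling.</p>

```python
import re
from datetime import datetime


def cheap_date(text):
    # Grab every run of digits; the punctuation between them is ignored.
    fields = [int(n) for n in re.findall(r"\d+", text)]
    # Assumption: fields arrive as year, month, day[, hour, minute, second].
    return datetime(*fields[:6])
```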
lreplace() and rreplace(): Replace the beginning and ends of a strings (Python)
2010-06-04T02:18:48-07:00 Dan McDougall http://code.activestate.com/recipes/users/4169722/ http://code.activestate.com/recipes/577252-lreplace-and-rreplace-replace-the-beginning-and-en/
<p style="color: grey">
Python
recipe 577252
by <a href="/recipes/users/4169722/">Dan McDougall</a>
(<a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/replace/">replace</a>, <a href="/recipes/tags/string/">string</a>).
</p>
<p>Python newbies will often make the following mistake (I certainly have =):</p>
<pre class="prettyprint"><code>>>> test = """this is a test:
... tis the season for mistakes."""
>>> for line in test.split('\n'):
...     print(line.lstrip('this'))
...
 is a test:
 the season for mistakes.
</code></pre>
<p>The mistake is assuming that lstrip() (or rstrip()) strips a whole string, when it actually strips every leading (or trailing) character that appears in the given string. At the time of writing, Python comes with no function to strip a whole string from the left-hand or right-hand side of a string, so I wrote this (very simple) recipe to solve that problem. Here's the usage:</p>
<pre class="prettyprint"><code>>>> test = """this is a test:
... tis the season for mistakes."""
>>> for line in test.split('\n'):
...     print(lreplace('this', '', line))
...
 is a test:
tis the season for mistakes.
</code></pre>
<p>I really wish Python had these functions built into the string object. I think it would be a useful addition to the standard library. It would also be nicer to type this:</p>
<pre class="prettyprint"><code>line.lreplace('this', '')
</code></pre>
<p>Instead of this:</p>
<pre class="prettyprint"><code>lreplace('this','',line)
</code></pre>
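<p>The recipe's own code is not part of this listing. A minimal sketch matching the <code>lreplace(pattern, sub, string)</code> call order shown above could look like this; the choice to escape the pattern and treat it as a literal string is an assumption. On Python 3.9 and later, <code>str.removeprefix()</code> and <code>str.removesuffix()</code> cover the pure stripping case.</p>

```python
import re


def lreplace(pattern, sub, string):
    """Replace `pattern` with `sub` only when `string` starts with it."""
    # re.escape treats the pattern literally; the lambda keeps `sub` literal too.
    return re.sub(r"\A" + re.escape(pattern), lambda m: sub, string)


def rreplace(pattern, sub, string):
    """Replace `pattern` with `sub` only when `string` ends with it."""
    return re.sub(re.escape(pattern) + r"\Z", lambda m: sub, string)
```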
Python Easily Packetize / Slice / Chunk Text (Python)
2011-09-30T05:34:57-07:00 __nero http://code.activestate.com/recipes/users/4177968/ http://code.activestate.com/recipes/577885-python-easily-packetize-slice-chunk-text/
<p style="color: grey">
Python
recipe 577885
by <a href="/recipes/users/4177968/">__nero</a>
(<a href="/recipes/tags/chunk/">chunk</a>, <a href="/recipes/tags/packetize/">packetize</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/split/">split</a>, <a href="/recipes/tags/udp/">udp</a>).
</p>
<p>I needed to chunk up some text to send over UDP and didn't want to have messy for loops with an if condition for size and then the little bit left over. All that struck me as very messy. I then thought of the re module and came up with a very simple solution to chunk up data.</p>
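<p>A sketch of chunking with the <code>re</code> module, in the spirit of the description above, follows. The function name <code>chunk</code> is hypothetical, and the size counts characters, not bytes, so text bound for UDP would still need to be encoded (or chunked as bytes) afterwards.</p>

```python
import re


def chunk(text, size):
    # DOTALL lets '.' match newlines, so no characters are dropped.
    return re.findall(r".{1,%d}" % size, text, flags=re.DOTALL)
```

<p>The final chunk simply comes out shorter, so there is no separate "left over" case to handle.</p>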
url_spider (Python)
2011-03-14T09:08:28-07:00 amir naghavi http://code.activestate.com/recipes/users/4177294/ http://code.activestate.com/recipes/577608-url_spider/
<p style="color: grey">
Python
recipe 577608
by <a href="/recipes/users/4177294/">amir naghavi</a>
(<a href="/recipes/tags/database/">database</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/web/">web</a>).
Revision 3.
</p>
<p>A simple URL spider that goes through web pages and collects URLs.</p>
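<p>The URL-collection step of such a spider might be sketched as below, working on an HTML string that has already been fetched. The names <code>HREF_RE</code> and <code>collect_urls</code> are assumptions, fetching and database storage are left out, and a regex like this only handles quoted <code>href</code> attributes.</p>

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' and captures the attribute value.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def collect_urls(html, base_url):
    """Return the absolute URLs linked from html, in order, without duplicates."""
    seen = []
    for href in HREF_RE.findall(html):
        url = urljoin(base_url, href)  # resolve relative links
        if url not in seen:
            seen.append(url)
    return seen
```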
Simple Regular Expression Tester (Python)
2010-12-25T00:12:44-08:00 Sunjay Varma http://code.activestate.com/recipes/users/4174115/ http://code.activestate.com/recipes/577517-simple-regular-expression-tester/
<p style="color: grey">
Python
recipe 577517
by <a href="/recipes/users/4174115/">Sunjay Varma</a>
(<a href="/recipes/tags/command/">command</a>, <a href="/recipes/tags/debugging/">debugging</a>, <a href="/recipes/tags/line/">line</a>, <a href="/recipes/tags/python/">python</a>, <a href="/recipes/tags/quick/">quick</a>, <a href="/recipes/tags/re/">re</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/testing/">testing</a>).
Revision 2.
</p>
<p><strong><em>Is it possible to create a simple command line program that I can use to quickly test my regular expressions?</em></strong></p>
<p>Yes, it is. This simple regular expression tester uses Python's re module, among others, to let you quickly test a regular expression in one go.</p>
<p>TODO:</p>
<ul>
<li>Add Support For Multiple Regular Expression Input</li>
</ul>
<p>Recent Changes:</p>
<ul>
<li>Made the output prettier with a little more whitespace. More bytes, but at least it's easier to read!</li>
</ul>
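<p>The core of such a tester, stripped of any command-line handling, could be as small as the sketch below. The function name <code>show_matches</code> and its return format are assumptions, not the recipe's interface.</p>

```python
import re


def show_matches(pattern, text):
    """Return (start, end, matched_text) for every match of pattern in text."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]
```

<p>An invalid pattern raises <code>re.error</code>, which a command-line wrapper would presumably catch and report.</p>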
tryout regex (Python)
2010-12-17T17:28:26-08:00 Lost Protocol http://code.activestate.com/recipes/users/4176279/ http://code.activestate.com/recipes/577503-tryout-regex/
<p style="color: grey">
Python
recipe 577503
by <a href="/recipes/users/4176279/">Lost Protocol</a>
(<a href="/recipes/tags/regex/">regex</a>).
Revision 2.
</p>
<p>Teaches you regex in python by trying it out. </p>
<pre class="prettyprint"><code>>>> only one command called compile allows you to compile a certain string
>>> any other input is treated as pattern to match
</code></pre>
Clean a .py file full of constant chr(x) calls (Python)
2010-04-02T08:20:50-07:00 Marcelo Fernández http://code.activestate.com/recipes/users/4173551/ http://code.activestate.com/recipes/577175-clean-a-py-file-full-of-constant-chrx-calls/
<p style="color: grey">
Python
recipe 577175
by <a href="/recipes/users/4173551/">Marcelo Fernández</a>
(<a href="/recipes/tags/chr/">chr</a>, <a href="/recipes/tags/cleaner/">cleaner</a>, <a href="/recipes/tags/cleaning/">cleaning</a>, <a href="/recipes/tags/regex/">regex</a>).
</p>
<p>This script identifies every chr(xx) call in a script (where xx is an integer) and replaces it with a constant byte string. For example, print chr(13) + chr(255) in the input .py file gets translated into '\r' + '\xff' in the output .py file, without breaking the input program and maybe speeding it up a little.</p>
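<p>A rough Python 3 sketch of the substitution is shown below. The names <code>CHR_CALL</code> and <code>clean_chr_calls</code> are hypothetical, only codes 0 to 255 are rewritten, and a plain regex like this would also rewrite <code>chr()</code> calls that appear inside string literals or comments.</p>

```python
import re

# chr( <digits> ) with optional whitespace around the number.
CHR_CALL = re.compile(r"\bchr\(\s*(\d+)\s*\)")


def _literal(match):
    code = int(match.group(1))
    if code > 255:
        return match.group(0)  # outside the byte range: leave the call alone
    # repr(bytes([13])) is "b'\\r'"; dropping the b prefix leaves a str literal.
    return repr(bytes([code]))[1:]


def clean_chr_calls(source):
    """Replace constant chr(N) calls in source text with string literals."""
    return CHR_CALL.sub(_literal, source)
```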
Python code minifier (Python)
2014-05-25T16:23:55-07:00 Dan McDougall http://code.activestate.com/recipes/users/4169722/ http://code.activestate.com/recipes/576704-python-code-minifier/
<p style="color: grey">
Python
recipe 576704
by <a href="/recipes/users/4169722/">Dan McDougall</a>
(<a href="/recipes/tags/bz2/">bz2</a>, <a href="/recipes/tags/bzip2/">bzip2</a>, <a href="/recipes/tags/comments/">comments</a>, <a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/docstring/">docstring</a>, <a href="/recipes/tags/embedded/">embedded</a>, <a href="/recipes/tags/gzip/">gzip</a>, <a href="/recipes/tags/minify/">minify</a>, <a href="/recipes/tags/pack/">pack</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/zlib/">zlib</a>).
Revision 16.
</p>
<p><strong>Update 05/25/2014:</strong> Pyminifier 2.0 has been released and now lives on Github: <a href="https://github.com/liftoff/pyminifier" rel="nofollow">https://github.com/liftoff/pyminifier</a> (docs are here: <a href="http://liftoff.github.io/pyminifier/" rel="nofollow">http://liftoff.github.io/pyminifier/</a>). The code below is very out-of-date but will be left alone for historical purposes.</p>
<p>Python Minifier: Reduces the size of Python code for use on embedded platforms. Performs the following:</p>
<ol>
<li>Removes docstrings.</li>
<li>Removes comments.</li>
<li>Removes blank lines.</li>
<li>Minimizes code indentation.</li>
<li>Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).</li>
<li>Preserves shebangs and encoding info (e.g. "# -*- coding: utf-8 -*-")</li>
<li><strong>NEW:</strong> Optionally, produces a bzip2 or gzip-compressed self-extracting python script containing the minified source for ultimate minification.</li>
</ol>
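<p>To give a feel for the comment-removal and blank-line steps listed above, here is a small sketch built on the standard <code>tokenize</code> module. It is not pyminifier's code: the function name <code>strip_comments</code> is made up, and unlike the real tool it does not preserve shebangs or encoding lines and will also trim whitespace inside multi-line strings.</p>

```python
import io
import tokenize


def strip_comments(source):
    """Drop comment tokens, trailing whitespace, and blank lines from source."""
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type != tokenize.COMMENT
    ]
    # untokenize pads removed tokens with spaces; strip that padding afterwards.
    stripped = tokenize.untokenize(tokens)
    lines = (line.rstrip() for line in stripped.splitlines())
    return "\n".join(line for line in lines if line) + "\n"
```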
<p><strong>Update 09/23/2010:</strong> Version 1.4.1: Fixed an indentation bug when operators such as @ and open parens started a line.</p>
<p><strong>Update 09/18/2010:</strong> Version 1.4:</p>
<ul>
<li>Added some command line options to save the result to an output file.</li>
<li>Added the ability to save the result as a bzip2 or gzip-compressed self-extracting python script (which is kinda neat--try it!).</li>
<li>Updated some of the docstrings to provide more examples of what each function does.</li>
</ul>
<p><strong>Update 06/02/2010:</strong> Version 1.3: Rewrote several functions to use Python's built-in tokenize module (which I just discovered, even though it has been in Python since version 2.2). This removed the requirement for pyparsing and improved performance by an order of magnitude. It also fixed some pretty serious bugs with dedent() and reduce_operators().</p>
<p>PLEASE POST A COMMENT IF YOU ENCOUNTER A BUG!</p>
Automagically dispatch commands using regex token classes (Python)
2009-09-28T03:46:29-07:00 Mick Krippendorf http://code.activestate.com/recipes/users/4171813/ http://code.activestate.com/recipes/576914-automagically-dispatch-commands-using-regex-token-/
<p style="color: grey">
Python
recipe 576914
by <a href="/recipes/users/4171813/">Mick Krippendorf</a>
(<a href="/recipes/tags/command_dispatch_pattern/">command_dispatch_pattern</a>, <a href="/recipes/tags/re/">re</a>, <a href="/recipes/tags/regex/">regex</a>).
Revision 2.
</p>
<p>The <em>(?P<...>...)</em> notation in Python's regular expressions can be viewed as a classification of matched tokens. The names of these classes can be used to dispatch tokens to appropriate handlers:</p>
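<p>The recipe's code is not included in this listing; a minimal sketch of the idea, using <code>match.lastgroup</code> to pick a handler method, might look like this. The token classes, the <code>on_</code> method-name prefix, and the <code>Handler</code> class are all invented for illustration.</p>

```python
import re

# Each named group is one token class.
TOKEN_RE = re.compile(r"(?P<number>\d+)|(?P<word>[A-Za-z]+)|(?P<space>\s+)")


class Handler:
    def __init__(self):
        self.numbers = []
        self.words = []

    def on_number(self, text):
        self.numbers.append(int(text))

    def on_word(self, text):
        self.words.append(text)

    def on_space(self, text):
        pass  # whitespace is recognised but ignored


def dispatch(text, handler):
    for match in TOKEN_RE.finditer(text):
        # lastgroup is the name of the alternative that matched.
        getattr(handler, "on_" + match.lastgroup)(match.group())
```

<p>Characters that match none of the token classes are silently skipped by <code>finditer</code>; a stricter tokenizer would add a catch-all group for them.</p>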
Dict-like string formatting from an object by named attributes (Python)
2009-08-17T12:59:16-07:00 Jacob Oscarson http://code.activestate.com/recipes/users/1355144/ http://code.activestate.com/recipes/576883-dict-like-string-formatting-from-an-object-by-name/
<p style="color: grey">
Python
recipe 576883
by <a href="/recipes/users/1355144/">Jacob Oscarson</a>
(<a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/text/">text</a>).
</p>
<p>Formats a string using the same notation as a dict-expansion, but
using an object as the source for the expansion instead. Attributes
from the object are used as source to the expansions.</p>
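<p>The recipe is tagged regex, but its code is not shown here. The sketch below gets a similar effect a different way, by wrapping the object in a small mapping adapter and using the <code>%</code> operator. The names <code>AttrMapping</code> and <code>format_from</code> are hypothetical.</p>

```python
class AttrMapping:
    """Expose an object's attributes through the mapping protocol."""

    def __init__(self, obj):
        self._obj = obj

    def __getitem__(self, name):
        try:
            return getattr(self._obj, name)
        except AttributeError:
            # %-formatting expects KeyError for missing keys.
            raise KeyError(name)


def format_from(template, obj):
    """Expand %(name)s-style placeholders from the attributes of obj."""
    return template % AttrMapping(obj)
```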
extract table into 2-vector from html page (Python)
2008-09-03T22:22:42-07:00 devdoer http://code.activestate.com/recipes/users/4166883/ http://code.activestate.com/recipes/576485-extract-table-into-2-vector-from-html-page/
<p style="color: grey">
Python
recipe 576485
by <a href="/recipes/users/4166883/">devdoer</a>
(<a href="/recipes/tags/html/">html</a>, <a href="/recipes/tags/parse/">parse</a>, <a href="/recipes/tags/regex/">regex</a>).
</p>
<p>extract table into 2-vector from html page</p>
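<p>Reading "2-vector" as a two-dimensional list (an assumption, since the recipe text gives no detail), a regex-based extraction could be sketched as follows. The names <code>ROW_RE</code>, <code>CELL_RE</code>, <code>TAG_RE</code>, and <code>table_to_rows</code> are hypothetical, and nested tables are not handled.</p>

```python
import re

ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def table_to_rows(html):
    """Return a list of rows, each a list of cell texts with inner tags removed."""
    return [
        [TAG_RE.sub("", cell).strip() for cell in CELL_RE.findall(row)]
        for row in ROW_RE.findall(html)
    ]
```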
transform a text to another by regex (Python)
2008-09-02T20:30:53-07:00 devdoer http://code.activestate.com/recipes/users/4166883/ http://code.activestate.com/recipes/576481-transform-a-text-to-another-by-regex/
<p style="color: grey">
Python
recipe 576481
by <a href="/recipes/users/4166883/">devdoer</a>
(<a href="/recipes/tags/parse/">parse</a>, <a href="/recipes/tags/regex/">regex</a>).
</p>
<p>transform a text to another by regex</p>