Popular Python recipes tagged "compression"
http://code.activestate.com/recipes/langs/python/tags/compression/
2014-05-25T16:23:55-07:00
ActiveState Code Recipes
Text Compressor 3.1 (Python)
2010-10-20T00:50:15-07:00Stephen Chappellhttp://code.activestate.com/recipes/users/2608421/http://code.activestate.com/recipes/577433-text-compressor-31/
<p style="color: grey">
Python
recipe 577433
by <a href="/recipes/users/2608421/">Stephen Chappell</a>
(<a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/encode/">encode</a>, <a href="/recipes/tags/encryption/">encryption</a>).
</p>
<p>Compression, encryption, and data codecs are related fields for which most programmers rely on ready-made solutions. This recipe is a shallow adventure into writing original code and algorithms that explore a combination of those topics. Based on the work of <a href="http://code.activestate.com/recipes/502202/">recipe 502202</a>, the code here is compatible with Python 3.1 and runs a self-test when executed. From the program's report, one can see that the novel procedures compress the source and accurately decompress it again. For those who wish to experiment further with the concept, note that input with fewer unique characters yields higher compression ratios.</p>
Creating a tar archive (without using the tarfile module) (Python)
2010-10-11T06:18:42-07:00Benjamin Sergeanthttp://code.activestate.com/recipes/users/4039626/http://code.activestate.com/recipes/577422-creating-a-tar-archive-without-using-the-tarfile-m/
<p style="color: grey">
Python
recipe 577422
by <a href="/recipes/users/4039626/">Benjamin Sergeant</a>
(<a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/tar/">tar</a>, <a href="/recipes/tags/unix/">unix</a>).
</p>
<p>Creating a tar file is easy if you read the spec (you can look it up on Wikipedia). Not every kind of file is supported (it supports regular files, folders, and symlinks), and it generates archives in the original tar file format (path lengths are limited to 100 chars, no extended attributes, ...). It wasn't tested very much but it was a fun hack :) ... I cheated just a little by looking at the Python tarfile code from the stdlib for the checksum computation.</p>
<p>A tar file is very simple: it's a list of header/payload pairs, one for each entry (file|folder|symlink) you want to archive. Only regular files have a payload (the file contents). The header is 512 bytes long and can be written in ASCII. Numbers (attributes) need to be written in octal. The files themselves need to be written in chunks of 512 bytes, which means you have to fill the last chunk with zeros when the file size is not a multiple of 512 bytes.</p>
<p>Use it like this: </p>
<pre class="prettyprint"><code>python batar.py /tmp/foo.tar `find .` && tar tf /tmp/foo.tar # or xf if you want to extract it
</code></pre>
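<p>The header/payload layout described above can be sketched roughly as follows. This is not the recipe's code: the names <code>octal_field</code>, <code>tar_header</code>, <code>tar_member</code>, and <code>build_archive</code> are hypothetical, only regular files are handled, and the uid, gid, and mtime values are placeholders. The two trailing zero blocks are the usual end-of-archive marker, which the description above does not mention.</p>

```python
BLOCK = 512  # tar works in 512-byte blocks


def octal_field(value, width):
    """Zero-padded octal digits followed by a NUL, `width` bytes in total."""
    return ("%0*o" % (width - 1, value)).encode("ascii") + b"\0"


def tar_header(name, size, mode=0o644, mtime=0):
    """Build one 512-byte header for a regular file (hypothetical helper)."""
    name_bytes = name.encode("ascii")
    if len(name_bytes) > 100:
        raise ValueError("the original tar format limits paths to 100 bytes")
    header = b"".join([
        name_bytes.ljust(100, b"\0"),  # offset 0: file name
        octal_field(mode, 8),          # offset 100: permissions
        octal_field(0, 8),             # offset 108: uid (placeholder)
        octal_field(0, 8),             # offset 116: gid (placeholder)
        octal_field(size, 12),         # offset 124: payload size
        octal_field(mtime, 12),        # offset 136: modification time
        b" " * 8,                      # offset 148: checksum, spaces for now
        b"0",                          # offset 156: type flag, regular file
    ]).ljust(BLOCK, b"\0")
    # The checksum is the sum of all header bytes, computed while the
    # checksum field itself still holds eight spaces.
    checksum = ("%06o" % sum(header)).encode("ascii") + b"\0 "
    return header[:148] + checksum + header[156:]


def tar_member(name, data):
    """Header plus payload, zero-padded to a multiple of 512 bytes."""
    padding = (-len(data)) % BLOCK
    return tar_header(name, len(data)) + data + b"\0" * padding


def build_archive(files):
    """Concatenate members and append the two-block end-of-archive marker."""
    body = b"".join(tar_member(name, data) for name, data in files)
    return body + b"\0" * (2 * BLOCK)


archive = build_archive([("hello.txt", b"hello tar\n")])
```

<p>Writing <code>archive</code> to a file and listing it with <code>tar tf</code> is a quick way to check the result; the stdlib <code>tarfile</code> module can also read it back.</p>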
Python code minifier (Python)
2014-05-25T16:23:55-07:00Dan McDougallhttp://code.activestate.com/recipes/users/4169722/http://code.activestate.com/recipes/576704-python-code-minifier/
<p style="color: grey">
Python
recipe 576704
by <a href="/recipes/users/4169722/">Dan McDougall</a>
(<a href="/recipes/tags/bz2/">bz2</a>, <a href="/recipes/tags/bzip2/">bzip2</a>, <a href="/recipes/tags/comments/">comments</a>, <a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/docstring/">docstring</a>, <a href="/recipes/tags/embedded/">embedded</a>, <a href="/recipes/tags/gzip/">gzip</a>, <a href="/recipes/tags/minify/">minify</a>, <a href="/recipes/tags/pack/">pack</a>, <a href="/recipes/tags/regex/">regex</a>, <a href="/recipes/tags/zlib/">zlib</a>).
Revision 16.
</p>
<p><strong>Update 05/25/2014:</strong> Pyminifier 2.0 has been released and now lives on Github: <a href="https://github.com/liftoff/pyminifier" rel="nofollow">https://github.com/liftoff/pyminifier</a> (docs are here: <a href="http://liftoff.github.io/pyminifier/" rel="nofollow">http://liftoff.github.io/pyminifier/</a>). The code below is very out-of-date but will be left alone for historical purposes.</p>
<p>Python Minifier: Reduces the size of Python code for use on embedded platforms. Performs the following:</p>
<ol>
<li>Removes docstrings.</li>
<li>Removes comments.</li>
<li>Removes blank lines.</li>
<li>Minimizes code indentation.</li>
<li>Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).</li>
<li>Preserves shebangs and encoding info (e.g. "# -*- coding: utf-8 -*-").</li>
<li><strong>NEW:</strong> Optionally, produces a bzip2 or gzip-compressed self-extracting python script containing the minified source for ultimate minification.</li>
</ol>
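<p>As a rough illustration of the comment-removal and header-preservation steps in the list above, here is a minimal sketch built on the standard <code>tokenize</code> module. It is not Pyminifier's implementation: <code>strip_comments</code> and <code>CODING_RE</code> are hypothetical names, and the sketch leaves trailing whitespace and blank lines where comments used to be.</p>

```python
import io
import re
import tokenize

# Assumed pattern for an encoding declaration such as "# -*- coding: utf-8 -*-"
CODING_RE = re.compile(r"coding[:=]\s*[-\w.]+")


def strip_comments(source):
    """Drop comment tokens, keeping a shebang or encoding line near the top."""
    kept = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # Shebangs and encoding declarations only count on lines 1 and 2.
            is_header = tok.start[0] <= 2 and (
                tok.string.startswith("#!") or CODING_RE.search(tok.string)
            )
            if not is_header:
                continue
        kept.append(tok)
    # untokenize() rebuilds the text from the recorded token positions, so
    # the gap left by a removed comment comes back as plain spaces.
    return tokenize.untokenize(kept)


example = (
    "#!/usr/bin/env python\n"
    "# -*- coding: utf-8 -*-\n"
    "x = 1  # set x\n"
    "# a comment on its own line\n"
    "y = x + 1\n"
)
minified = strip_comments(example)
```

<p>Working on tokens rather than raw text means a <code>#</code> inside a string literal is never mistaken for a comment.</p>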
<p><strong>Update 09/23/2010:</strong> Version 1.4.1: Fixed an indentation bug when operators such as @ and open parens started a line.</p>
<p><strong>Update 09/18/2010:</strong> Version 1.4:</p>
<ul>
<li>Added some command line options to save the result to an output file.</li>
<li>Added the ability to save the result as a bzip2 or gzip-compressed self-extracting python script (which is kinda neat--try it!).</li>
<li>Updated some of the docstrings to provide more examples of what each function does.</li>
</ul>
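<p>The self-extracting idea can be sketched like this. The wrapper format below (base64 text around a bzip2 payload) is an assumption made for illustration, not necessarily what the recipe emits, and <code>self_extracting</code> is a hypothetical name.</p>

```python
import base64
import bz2


def self_extracting(source):
    """Wrap `source` in a small script that decompresses and runs it."""
    compressed = bz2.compress(source.encode("utf-8"))
    # base64 keeps the payload printable; its alphabet contains no quote or
    # percent characters, so it can sit inside a plain string literal.
    payload = base64.b64encode(compressed).decode("ascii")
    return (
        "import bz2, base64\n"
        "exec(bz2.decompress(base64.b64decode('%s')).decode('utf-8'))\n"
        % payload
    )


script = self_extracting("result = 6 * 7\n")
```

<p>For very small inputs the wrapper is larger than the original source, so this only pays off once the script is big enough for bzip2 to outweigh the base64 and import overhead.</p>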
<p><strong>Update 06/02/2010:</strong> Version 1.3: Rewrote several functions to use Python's built-in tokenize module (which I just discovered despite it being in Python since version 2.2). This removed the requirement for pyparsing and improved performance by an order of magnitude. It also fixed some pretty serious bugs with dedent() and reduce_operators().</p>
<p>PLEASE POST A COMMENT IF YOU ENCOUNTER A BUG!</p>
Huffman coding, Encoder/Deconder (Python)
2009-01-04T04:11:37-08:00Shao-chuan Wanghttp://code.activestate.com/recipes/users/4168519/http://code.activestate.com/recipes/576603-huffman-coding-encoderdeconder/
<p style="color: grey">
Python
recipe 576603
by <a href="/recipes/users/4168519/">Shao-chuan Wang</a>
(<a href="/recipes/tags/algorithm/">algorithm</a>, <a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/decompression/">decompression</a>, <a href="/recipes/tags/huffman_code/">huffman_code</a>).
Revision 2.
</p>
<p>Please refer to Wikipedia: <a href="http://en.wikipedia.org/wiki/Huffman_coding" rel="nofollow">http://en.wikipedia.org/wiki/Huffman_coding</a></p>
<p>Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file) where the variable-length code table has been derived in a particular way based on the estimated probability of occurrence for each possible value of the source symbol. It was developed by David A. Huffman while he was a Ph.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes".</p>
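<p>To make the variable-length code table concrete, here is a minimal sketch of the textbook construction using a priority queue. It is not the recipe's code; the names <code>build_codes</code>, <code>encode</code>, and <code>decode</code> are hypothetical, and the output is a string of "0" and "1" characters rather than packed bits.</p>

```python
import heapq
from collections import Counter


def build_codes(text):
    """Return a {symbol: bit string} table derived from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (weight, tiebreak, tree); a tree is either a symbol
    # or a (left, right) pair. The unique tiebreak keeps trees from ever
    # being compared with each other.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}  # a lone symbol still needs one bit
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees into one.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tiebreak, (left, right)))
        tiebreak += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[symbol] for symbol in text)


def decode(bits, codes):
    """Read bits until they match a code; prefix-freeness makes this safe."""
    lookup = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)


text = "abracadabra"
codes = build_codes(text)
bits = encode(text, codes)
```

<p>Frequent symbols end up near the root of the tree and so receive the shortest codes, which is where the compression comes from.</p>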
Serve static web content from within a gzipped tarball to save space using CherryPy (Python)
2009-03-31T18:24:06-07:00Dan McDougallhttp://code.activestate.com/recipes/users/4169722/http://code.activestate.com/recipes/576706-serve-static-web-content-from-within-a-gzipped-tar/
<p style="color: grey">
Python
recipe 576706
by <a href="/recipes/users/4169722/">Dan McDougall</a>
(<a href="/recipes/tags/cherrypy/">cherrypy</a>, <a href="/recipes/tags/compression/">compression</a>, <a href="/recipes/tags/embedded/">embedded</a>, <a href="/recipes/tags/gzip/">gzip</a>, <a href="/recipes/tags/html/">html</a>, <a href="/recipes/tags/http/">http</a>, <a href="/recipes/tags/network/">network</a>, <a href="/recipes/tags/routes/">routes</a>, <a href="/recipes/tags/web/">web</a>, <a href="/recipes/tags/web_server/">web_server</a>).
</p>
<p>This code lets you store all of your static website content inside a gzipped tarball while transparently serving it to HTTP clients on-demand. Perfect for embedded systems where space is limited.</p>
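<p>The lookup side of this idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the recipe's CherryPy handler: <code>make_archive</code> and <code>read_static</code> are hypothetical names, the archive is held in memory as bytes, and the web framework wiring is left out.</p>

```python
import io
import mimetypes
import tarfile


def make_archive(files):
    """Build a gzipped tarball in memory from a {path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_static(archive_bytes, path):
    """Return (content_type, body) for `path`, or None if it is not a file."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        try:
            member = tar.getmember(path)
        except KeyError:
            return None  # no such entry in the archive
        if not member.isfile():
            return None  # directories and links are not served
        body = tar.extractfile(member).read()
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    return content_type, body


site = make_archive({"index.html": b"hello", "static/app.css": b"body {}"})
response = read_static(site, "index.html")
```

<p>Reopening the archive on every request keeps the sketch short but decompresses the tarball each time; a real handler would probably keep an index or cache of members.</p>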