ActiveState Code Recipes: Popular Python recipes tagged "from" (http://code.activestate.com/recipes/langs/python/tags/from/)
Fix mbox files after importing EML into TB using ImportExportTools (Python)
2010-05-02T13:21:00-07:00, Denis Barmenkov (http://code.activestate.com/recipes/users/57155/), http://code.activestate.com/recipes/577214-fix-mbox-files-after-importing-eml-into-tb-using-i/
<p style="color: grey">
Python
recipe 577214
by <a href="/recipes/users/57155/">Denis Barmenkov</a>
(<a href="/recipes/tags/eml/">eml</a>, <a href="/recipes/tags/from/">from</a>, <a href="/recipes/tags/import/">import</a>, <a href="/recipes/tags/importexporttools/">importexporttools</a>, <a href="/recipes/tags/mbox/">mbox</a>, <a href="/recipes/tags/tb/">tb</a>, <a href="/recipes/tags/thunderbird/">thunderbird</a>).
Revision 2.
</p>
<p>I found a bug when importing EML files into Thunderbird (TB) with the ImportExportTools add-on:
when I import an EML file into TB, a 'From' line is added to the mbox, followed by the EML file contents.
For messages fetched from mail servers, TB writes the 'From' line correctly:</p>
<pre class="prettyprint"><code>From - Tue Apr 27 19:42:22 2010
</code></pre>
<p>ImportExportTools formats this line incorrectly. I suppose it uses some system function with a default format specifier, so this is what I saw in the mbox file:</p>
<pre class="prettyprint"><code>From - Sat May 01 2010 15:07:31 GMT+0400 (Russian Daylight Time)
</code></pre>
<p>So there are two errors:
1) the 'time year' sequence is swapped into 'year time'
2) extra junk follows the time: the GMT offset and the time zone name</p>
<p>This prevents the mbox file from being parsed with, for example, the Python standard library, because it has a hardcoded regexp for matching the From line (file lib/mailbox.py, class UnixMailbox):</p>
<pre class="prettyprint"><code>_fromlinepattern = r"From \s*[^\s]+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+" \
r"\d?\d:\d\d(:\d\d)?(\s+[^\s]+)?\s+\d\d\d\d\s*$"
</code></pre>
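<p>To see the mismatch concretely, here is a minimal sketch that compiles the pattern quoted above and tries it against the two sample lines from this description. The helper name <code>is_valid_from_line</code> is made up for illustration and is not part of the standard library.</p>

```python
import re

# Pattern copied from the text above (lib/mailbox.py, class UnixMailbox).
FROM_LINE_PATTERN = (
    r"From \s*[^\s]+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+"
    r"\d?\d:\d\d(:\d\d)?(\s+[^\s]+)?\s+\d\d\d\d\s*$"
)
from_line_re = re.compile(FROM_LINE_PATTERN)

# Sample lines taken from the description above.
good_line = "From - Tue Apr 27 19:42:22 2010"
bad_line = "From - Sat May 01 2010 15:07:31 GMT+0400 (Russian Daylight Time)"


def is_valid_from_line(line):
    """Hypothetical helper: True if the line matches the stdlib pattern."""
    return from_line_re.match(line) is not None


print(is_valid_from_line(good_line))
print(is_valid_from_line(bad_line))
```

<p>The second line fails because the pattern expects the time right after the day of month, but finds the year there instead.</p>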
<p>The attached script fixes the incorrect From lines so that those mboxes can be parsed with the Python standard library.</p>
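<p>The attached script itself is not reproduced here. As a rough sketch of the idea only, the malformed line can be recognized with a regexp and rewritten with the year moved after the time and the GMT/time zone tail dropped. The names <code>BAD_FROM_RE</code>, <code>fix_from_line</code> and <code>fix_mbox_lines</code> are hypothetical, the pattern assumes the exact layout of the sample line shown above, and no time zone conversion is attempted.</p>

```python
import re

# Assumed layout of the malformed line, based only on the sample above:
# From - Sat May 01 2010 15:07:31 GMT+0400 (Russian Daylight Time)
BAD_FROM_RE = re.compile(
    r"^From - (\w{3}) (\w{3}) (\d{1,2}) (\d{4}) (\d{1,2}:\d{2}:\d{2})"
    r" GMT[+-]\d{4}(?: \(.*\))?\s*$"
)


def fix_from_line(line):
    """Rewrite a malformed 'From' line; return other lines unchanged."""
    match = BAD_FROM_RE.match(line)
    if match is None:
        return line
    weekday, month, day, year, time_part = match.groups()
    # Preserve whatever line ending the original line had.
    stripped = line.rstrip("\r\n")
    ending = line[len(stripped):]
    # Put the fields back in 'time year' order and drop the GMT tail.
    fixed = "From - %s %s %s %s %s" % (weekday, month, day, time_part, year)
    return fixed + ending


def fix_mbox_lines(lines):
    """Yield the lines of an mbox with malformed 'From' lines repaired."""
    for line in lines:
        yield fix_from_line(line)
```

<p>For example, <code>fix_from_line</code> applied to the malformed sample line yields <code>From - Sat May 01 15:07:31 2010</code>, which has the same shape as the line TB writes itself. Work on a copy of the mbox file, since a body line that happens to match the pattern would be rewritten too.</p>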