Popular recipes by Douglas Bagnall
http://code.activestate.com/recipes/users/1629020/
2004-11-07T01:08:26-08:00
ActiveState Code Recipes

Language detection using character trigrams (Python)
2004-11-07T01:08:26-08:00
Douglas Bagnall
http://code.activestate.com/recipes/users/1629020/
http://code.activestate.com/recipes/326576-language-detection-using-character-trigrams/
<p style="color: grey">
Python
recipe 326576
by <a href="/recipes/users/1629020/">Douglas Bagnall</a>
(<a href="/recipes/tags/algorithms/">algorithms</a>).
Revision 2.
</p>
<p>The Trigram class can be used to compare blocks of text based on their local structure, which is a good indicator of the language used. It could also be used within a language to discover and compare the characteristic footprints of various registers or authors. As all n-gram implementations should, it has a method to make up nonsense words.</p>
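<p>To illustrate the idea of comparing texts by their local structure, here is a minimal sketch. It is not the recipe's Trigram class: the names <code>trigram_profile</code> and <code>cosine_similarity</code> are hypothetical, and cosine similarity over trigram counts is an assumed choice of comparison measure.</p>

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count overlapping character trigrams (hypothetical helper).

    The text is lowercased and runs of whitespace are collapsed, then
    padded so that word-initial and word-final letters form trigrams too.
    """
    cleaned = " ".join(text.lower().split())
    padded = "  " + cleaned + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a, b):
    """Cosine of the angle between two trigram count vectors (0.0 to 1.0)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


english = trigram_profile(
    "the quick brown fox jumps over the lazy dog and then the other dog runs after the fox"
)
german = trigram_profile(
    "der hund und die katze sind in dem haus und der fuchs ist im wald"
)
sample = trigram_profile("the dog and the fox are in the other house")

# The sample should score closer to the English reference than the German one.
print(cosine_similarity(sample, english), cosine_similarity(sample, german))
```

<p>With reference texts this short the scores are only indicative; in practice each language profile would be built from a much larger body of text.</p>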
winnowing data with a heap. (Python)
2004-08-11T05:52:17-07:00
Douglas Bagnall
http://code.activestate.com/recipes/users/1629020/
http://code.activestate.com/recipes/299058-winnowing-data-with-a-heap/
<p style="color: grey">
Python
recipe 299058
by <a href="/recipes/users/1629020/">Douglas Bagnall</a>
(<a href="/recipes/tags/algorithms/">algorithms</a>).
Revision 3.
</p>
<p>The winnow class uses a heap to find the best few out
of several items. At this it is quicker and shorter than
Python 2.3's heapq module, which is aimed at queuing rather
than sifting. On the other hand, it is unlikely to have any advantage over
Python 2.4's heapq, which (I hear) has expanded functionality and is
implemented in C.</p>
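<p>The general technique of keeping the best few items with a heap can be sketched as follows. This is not the recipe's winnow class: <code>best_few</code> is a hypothetical name, and the sketch leans on the standard-library heapq functions rather than a hand-written heap.</p>

```python
import heapq


def best_few(items, k):
    """Return the k largest items, largest first (hypothetical helper).

    A min-heap of at most k items is kept; its root is the weakest of
    the current best, so a new item only needs comparing against that.
    """
    if k <= 0:
        return []
    heap = []
    for item in items:
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # Drop the weakest survivor and admit the new item in one step.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


print(best_few([5, 1, 9, 3, 7, 2], 3))
```

<p>Because the heap never holds more than k items, memory stays bounded however long the input is. From Python 2.4 onward, <code>heapq.nlargest</code> gives the same result directly.</p>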