Re: [Python-ideas] difflib.SequenceMatcher quick_ratio

From: Andrew Barnert via Python-ideas <pyth...@python.org>
Mon, 8 Jun 2015 06:31:34 -0700
If this really is needed as a performance optimization, surely you want to do something faster than loop over dozens of comparisons to decide whether you can skip the actual work?

I don't know if this is something you can calculate analytically, but if not, you're presumably doing this on zillions of lines, and instead of repeating the loop every time, wouldn't it be better to just do it once and then just check the ratio each time? (You could hide that from the caller by just factoring out the loop to a function _get_ratio_for_threshold and decorating it with @lru_cache. But I don't know if you really need to hide it from the caller.)

Also, do the extra checks for 0, 1, and 0.1 and for empty strings actually speed things up in practice?

> On Jun 8, 2015, at 00:56, floyd <flo...@floyd.ch> wrote:
>
> Hi *
>
> I use this python line quite a lot in some projects:
>
> if difflib.SequenceMatcher.quick_ratio(None, a, b) >= threshold:
>
> I realized that this is performance-wise not optimal, therefore wrote a
> method that will return much faster in a lot of cases by using the
> length of "a" and "b" to calculate the upper bound for "threshold":
>
> if difflib.SequenceMatcher.quick_ratio_ge(None, a, b, threshold):
>
> I'd say we could include it into the stdlib, but maybe it should only be
> a python code recipe?
>
> I would say this is one of the most frequent use cases for difflib, but
> maybe that's just my biased opinion :) . What's yours?
>
> See http://bugs.python.org/issue24384
>
> cheers,
> floyd
_______________________________________________
Python-ideas mailing list
Pyth...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
