In article <9qh2i1-2up.ln1 at home.rogerbinns.com>,
"Roger Binns" <rogerb at rogerbinns.com> wrote:
> Roy Smith wrote:> > I find it interesting that many> > people who come from a C++ background (especially those who migrated> > from C) tend to dislike exceptions. It must have been something in the> > water when they were growing up :-)> > It also affects former Java weenies. As a Java programmer you expect> your code to be run under a JIT (and not interpretted) and you are> aware that exceptions are significantly more expensive than straight> forward code. This is less of an issue in Python as interpretters> don't tend to have as expensive exceptions compared to "normal" code.
But, exceptions are things that (generally) don't happen very often.
What difference does it make if something that doesn't happen very often
is slow?
What really tends to surprise C++ types is that in Python, using
exceptions is often FASTER than the alternative. For example, if most
accesses of a dictionary are for keys that are already there, it's
faster to just try it and catch the occasional KeyError than to check
before you try the access.
I wrote a little test program which reads a dictionary, generates the
leading digram of each word (i.e. the first two letters) then builds a
histogram of digram frequency. I build the histogram 10 times to dilute
out the time it takes to read the file in the first place.
Each of the 10 runs results in about 2 million dictionary accesses, of
which about 500 generate KeyErrors. I ran it using two algorithms; one
catches the KeyErrors and deals with problem, the other uses has_key()
to check first to avoid raising the exceptions in the first place.
Here's the results (see full code listing below):
Roy-Smiths-Computer:play$ time ./digram.py -h > has
real 0m7.633s
user 0m6.830s
sys 0m0.100s
Roy-Smiths-Computer:play$ time ./digram.py -e > ex
real 0m5.642s
user 0m4.970s
sys 0m0.120s
The output files (has and ex) compare identical. The has_key() version
took 6.8 seconds, the exception catching version took 5.0.
I think what's really interesting about this is that even in C++, where
exceptions are relatively costly, with a hit ratio of 500:2,000,000
(0.025%), I bet the exception catching version would still be faster.
---------digram.py----------
#!/usr/bin/env python
import sys
def main ():
digrams = []
words = open ("/usr/share/dict/words")
for word in words:
digram = word.rstrip().lower()[0:2]
digrams.append (digram)
count = 10
if sys.argv[1] == "-e":
d = countEx (digrams, count)
else:
d = countHas (digrams, count)
digrams = d.keys ()
digrams.sort ()
for digram in digrams:
print "%s: %d" % (digram, d [digram])
def countEx (digrams, count):
d = {}
for i in range (count):
for digram in digrams:
try:
d[digram] += 1
except KeyError:
d[digram] = 1
return d
def countHas (digrams, count):
d = {}
for i in range (count):
for digram in digrams:
if d.has_key (digram):
d[digram] += 1
else:
d[digram] = 1
return d
if __name__ == "__main__":
main ()
-----------------------------------