| Store | Cart

Re: [TCLCORE] Upgrade of Tcl's String Hash Algorithm

From: Gustaf Neumann <neum...@wu-wien.ac.at>
Thu, 18 Feb 2010 13:23:47 +0100
Am 17.02.10 19:41, schrieb Tom Jackson:
> The keyspace is only 32 bits, this is trivial by any measure. I guess> Gustav has already found several sets of keys. Of course, the real> problem is that you only need the low order bits of the hash to match,> so the keyspace for the buckets is much smaller than 32 bits.>    
Exactly.  It is sufficient to attack a much smaller keyspace to achieve
the reported slowdowns. To make a lookup in a table with 55000
entries a linear search, it is sufficient to find hash-values with the
identical 14 low bits (in the current mapping from hash-key to bucket).
One can construct such key-sets for every hash function in a c
program in less than 20 seconds (finds e.g.  70000 keys with the
same hash suffix as "hello" based on fnv).

As pointed out earlier, this attack is not possible when using a randomized
seed and hash functions like murmur or lookp3, which mix the seed properly
into the hash value, since on every start of tcl, it will return 
sufficiently
different hash values for the same keys.

Since tcl support custom hash tables, every c-level application can
provide for their own hash tables a different hash function with and 
without
seeds.

In case, there is still someone listening:
Below are some results from loading dict/words into
an array (the script donal posted) with 4 hash algorithms.
It shows, that Tcl's classic hash function does very well,
murmur is the seconds and the perl hash function is the
worst of these.

-gustaf neumann

   CLASSIC
   218022.7906 microseconds per iteration

   MURMUR
   219907.73859999998 microseconds per iteration

   LOOKUP3
   225928.0354 microseconds per iteration

   PERL_HASH
   249029.0566 microseconds per iteration


------------------------------------------------------------------------------
Download Intel&reg; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs 
proactively, and fine-tune applications for parallel performance. 
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
Tcl-Core mailing list
Tcl-...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tcl-core

Recent Messages in this Thread
Donal K. Fellows Feb 07, 2010 09:36 am
Eric Melski Feb 08, 2010 06:40 am
Donal K. Fellows Feb 08, 2010 11:21 am
Eric Melski Feb 08, 2010 06:45 pm
Donal K. Fellows Feb 09, 2010 01:47 am
Larry McVoy Feb 09, 2010 02:42 am
Colin McCormack Feb 09, 2010 01:05 pm
Larry McVoy Feb 09, 2010 03:48 pm
Colin McCormack Feb 09, 2010 11:14 pm
Gustaf Neumann Feb 09, 2010 04:15 pm
Donal K. Fellows Feb 10, 2010 12:38 am
Lars Hellström Feb 14, 2010 10:56 pm
Larry McVoy Feb 15, 2010 03:20 am
Donal K. Fellows Feb 15, 2010 10:02 pm
Tom Jackson Feb 15, 2010 08:55 pm
Lars Hellström Feb 15, 2010 11:53 pm
Donal K. Fellows Feb 16, 2010 11:56 am
Colin McCormack Feb 16, 2010 01:01 pm
Eric Melski Feb 09, 2010 11:35 pm
Alexandre Ferrieux Feb 11, 2010 12:08 am
Alexandre Ferrieux Feb 12, 2010 04:57 pm
Gustaf Neumann Feb 14, 2010 04:58 pm
Donal K. Fellows Feb 14, 2010 11:54 pm
Gustaf Neumann Feb 16, 2010 03:25 pm
Donal K. Fellows Feb 16, 2010 04:45 pm
Gustaf Neumann Feb 17, 2010 11:04 am
Kristoffer Lawson Feb 17, 2010 11:18 am
Tom Jackson Feb 16, 2010 06:30 pm
Donal K. Fellows Feb 16, 2010 10:44 pm
Gustaf Neumann Feb 17, 2010 11:49 am
Donal K. Fellows Feb 16, 2010 10:36 pm
John Ousterhout Feb 15, 2010 04:56 am
Donal K. Fellows Feb 15, 2010 03:14 pm
Colin McCormack Feb 16, 2010 01:16 pm
Larry McVoy Feb 15, 2010 04:16 pm
Tom Jackson Feb 18, 2010 09:31 pm
Tom Jackson Feb 17, 2010 06:41 pm
Donal K. Fellows Feb 18, 2010 09:23 am
Gustaf Neumann Feb 18, 2010 12:23 pm
Tom Jackson Feb 17, 2010 01:36 am
Donal K. Fellows Feb 17, 2010 11:56 am
Tom Jackson Feb 16, 2010 09:30 am
Joe English Feb 08, 2010 07:56 pm
Donal K. Fellows Feb 08, 2010 09:56 pm
Messages in this thread