| Store | Cart

[TCLCORE] Upgrade of Tcl's String Hash Algorithm

From: Donal K. Fellows <dona...@manchester.ac.uk>
Sun, 07 Feb 2010 09:36:17 +0000
Attachments: donal_k_fellows.vcf
Hi everyone

This is just a short note to let everyone know that I've just upgraded
Tcl 8.6's string (and Tcl_Obj) hash algorithm. Specifically, I've
changed from using JO's old algorithm that was approximately[*]:

   hash = ∑ str[i] * 9**i

to the FNV algorithm <URL:http://isthe.com/chongo/tech/comp/fnv/> which
is similar in many ways but uses a large prime as the power (instead of
9) and XOR as the combiner instead of addition. The result is a bit
faster when I test it, and a little bit harder for code to push into
situations where it performs badly. I don't know whether this is from
faster computation or better distribution of the resulting hash values.

How much better is the new function? I'm sorry to say that the state of
research in this area is remarkably poor; very few people are looking at
what makes for a good *non-cryptographic* string hashing function. About
the only thing that seems to be known for sure is that the power needs
to be odd so that low order bits don't get lost. But the FNV algorithm
seems to be the most thoroughly understood, and there was a Public
Domain implementation (which helped). The big difference is that the
amount of mixing of bits is much larger than before.

The main reason for this note is that it affects anyone who had code
that assumed anything about hash iteration order. If you've got that
assumption in your code, it's now been broken. This is more likely to
impact on tests rather than production code - production code is more
likely to have to deal with an expansion of the hash table which changes
the order anyway, and so won't make the assumption - and the fix is to
sort the values extracted from the hash before testing for equality. :-)

Donal.
[* The equation is true if you index from the end of the string... ]

begin:vcard
fn:Donal K. Fellows
n:Fellows;Donal
org:The University of Manchester;Research Computing Services
adr:;;Oxford Road;Manchester;;M13 9PL;United Kingdom
email;internet:dona...@manchester.ac.uk
version:2.1
end:vcard

------------------------------------------------------------------------------
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
_______________________________________________
Tcl-Core mailing list
Tcl-...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tcl-core

Recent Messages in this Thread
Donal K. Fellows Feb 07, 2010 09:36 am
Eric Melski Feb 08, 2010 06:40 am
Donal K. Fellows Feb 08, 2010 11:21 am
Eric Melski Feb 08, 2010 06:45 pm
Donal K. Fellows Feb 09, 2010 01:47 am
Larry McVoy Feb 09, 2010 02:42 am
Colin McCormack Feb 09, 2010 01:05 pm
Larry McVoy Feb 09, 2010 03:48 pm
Colin McCormack Feb 09, 2010 11:14 pm
Gustaf Neumann Feb 09, 2010 04:15 pm
Donal K. Fellows Feb 10, 2010 12:38 am
Lars Hellström Feb 14, 2010 10:56 pm
Larry McVoy Feb 15, 2010 03:20 am
Donal K. Fellows Feb 15, 2010 10:02 pm
Tom Jackson Feb 15, 2010 08:55 pm
Lars Hellstrm Feb 15, 2010 11:53 pm
Donal K. Fellows Feb 16, 2010 11:56 am
Colin McCormack Feb 16, 2010 01:01 pm
Eric Melski Feb 09, 2010 11:35 pm
Alexandre Ferrieux Feb 11, 2010 12:08 am
Alexandre Ferrieux Feb 12, 2010 04:57 pm
Gustaf Neumann Feb 14, 2010 04:58 pm
Donal K. Fellows Feb 14, 2010 11:54 pm
Gustaf Neumann Feb 16, 2010 03:25 pm
Donal K. Fellows Feb 16, 2010 04:45 pm
Gustaf Neumann Feb 17, 2010 11:04 am
Kristoffer Lawson Feb 17, 2010 11:18 am
Tom Jackson Feb 16, 2010 06:30 pm
Donal K. Fellows Feb 16, 2010 10:44 pm
Gustaf Neumann Feb 17, 2010 11:49 am
Donal K. Fellows Feb 16, 2010 10:36 pm
John Ousterhout Feb 15, 2010 04:56 am
Donal K. Fellows Feb 15, 2010 03:14 pm
Colin McCormack Feb 16, 2010 01:16 pm
Larry McVoy Feb 15, 2010 04:16 pm
Tom Jackson Feb 18, 2010 09:31 pm
Tom Jackson Feb 17, 2010 06:41 pm
Donal K. Fellows Feb 18, 2010 09:23 am
Gustaf Neumann Feb 18, 2010 12:23 pm
Tom Jackson Feb 17, 2010 01:36 am
Donal K. Fellows Feb 17, 2010 11:56 am
Tom Jackson Feb 16, 2010 09:30 am
Joe English Feb 08, 2010 07:56 pm
Donal K. Fellows Feb 08, 2010 09:56 pm
Messages in this thread