A simple little task, how to find the intersection of two hashes. There's probably many ways to do this but there is one way to avoid...
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | # two dictionaries
# blow this up to 1000 to see the difference
some_dict = { 'zope':'zzz', 'python':'rocks' }
another_dict = { 'python':'rocks', 'perl':'$' }
# bad way
# two lots of "in"
intersect = []
for item in some_dict.keys():
if item in another_dict.keys():
intersect.append(item)
print "Intersects:", intersect
# good way
# use simple lookup with has_keys()
intersect = []
for item in some_dict.keys():
if another_dict.has_key(item):
intersect.append(item)
print "Intersects:", intersect
|
The keys method produces a list of all the keys to dict and it can be pretty simple to fall in to the trap of just using "in". However in the first example you looping through all of some_dict, then each time looping through all of another_dict.
By using the has_keys method, you are not looping another_dict, but pulling the key from the dict in the resizable hash table. The difference is understandably quite remarkable for large dictionaries.
I think the code here has a typo, it should say "has_key" instead of "has_keys"
A Python 2.2 enhancement. Note that the difference disappears in the forthcoming Python 2.2: there, "if somekey in in adict" becomes equivalent to "if adict.has_key(somekey)". The recipe should therefore be indicated as only important for Python 2.1 and earlier.
Oops. Thanks.
Alternatively...
Dictionaries of different size. It is also worthwhile to note that if one dictionary has considerably more elements, the smallest dictionary should be the one to loop over. However, the difference starts to get noticeable only with dictionaries that have thousands of keys.
Finding items not in both dictionaries. What if I want to find the items that are not in both dictionaries?
Finding intersections and non-intersections. ### Finding intersections and non-intersections
some_dict = { 'zope':'zzz', 'python':'rocks','1':'one','2':'two','3':'three','4':'four' }
another_dict = { 'python':'rocks', 'perl':'$','2':'two','3':'three' }
intersection = dict([(item,some_dict[item]) for item in some_dict.keys() if another_dict.has_key(item)])
non_intersection = dict([(item,some_dict[item]) for item in some_dict.keys() if not another_dict.has_key(item)])
print "a new dict containing the intersection:",intersection
print "a new dict containing the non-intersection:",non_intersection
I do not know what a non-intersection is through education but I expected a non-intersection is:
keys in some_dict but not in another_dict AND
keys in another_dict but not in some_dict.
s g's recipe:
Testing this code in python 2.5 and Pythonwin with "print non_interection" shows that another_dict['perl'] was not appended to non_intersection even though it is not in some_dict.
You are correct. In order to complete the non-intersection, you must reverse the dictionaries in the same command:
and then merge them together:
I think in recent versions this might be easier. (Tested with Python 3.3.0)
If you only want to intersect keys/values use keys() or values() instead (without the dict()). (Sorry if this is old, but google still lists it on first place)