Python has a powerful suite of tools for comparing lists by way of sets and frozensets. Here are a few examples and conveniences that many newcomers, and even a few seasoned developers, are unaware of.
#!/usr/bin/env python
""" Convenience methods for list comparison & manipulation

Fast and useful, set/frozenset* only retain unique values,
duplicates are automatically removed.

lr_union    union
            merge values, remove duplicates
lr_diff     difference
            left elements, subtracting any in common with right
lr_intr     intersection
            only common values found in both left and right
lr_symm     symmetric_difference
            omit values found in both left and right
lr_cont     issuperset
            test left contains all values from right

* Unlike set, frozenset is immutable (and therefore hashable).
  Neither set nor frozenset preserves the source order.
"""
lr_union = lambda l, r: list(set(l).union(r))
lr_diff = lambda l, r: list(set(l).difference(r))
lr_intr = lambda l, r: list(set(l).intersection(r))
lr_symm = lambda l, r: list(set(l).symmetric_difference(r))
lr_cont = lambda l, r: set(l).issuperset(r)
# lr_intr variant that silently treats None on the right as an empty list
lrq_intr = lambda l, r: list(set(l).intersection(r or []))
# ------------ NOTHING BELOW HERE IS REQUIRED --------------

def tests():
    """ doctest tests/examples for set and set conveniences

    A few examples without the conveniences above.
    Strings are sequences too, so they can be passed where appropriate

    >>> set('aabbcc') # only unique are returned
    set(['a', 'c', 'b'])

    Do the work and cast as list (switch to tuple if preferred)

    >>> list(set('aabbcc'))
    ['a', 'c', 'b']

    Using list does not remove duplicates

    >>> list('aabbcc') # list is not unique
    ['a', 'a', 'b', 'b', 'c', 'c']

    Simple join of lists, note the redundant values

    >>> ['a', 'a', 'b'] + ['b', 'c', 'c']
    ['a', 'a', 'b', 'b', 'c', 'c']

    Join both lists, return only unique values, join list before set (slower)

    >>> list(set(['a', 'a', 'b'] + ['b', 'c', 'c']))
    ['a', 'c', 'b']

    Join lists, as above, using built-in set library (faster)

    >>> lr_union(['a', 'a', 'b'], ['b', 'c', 'c'])
    ['a', 'c', 'b']

    Remove right values from left

    >>> lr_diff(['a', 'b'], ['b', 'c'])
    ['a']

    Remove as above, swapped/reordered inputs to remove left from right

    >>> lr_diff(['b', 'c'], ['a', 'b'])
    ['c']

    Common elements

    >>> lr_intr(['a', 'b'], ['b', 'c'])
    ['b']

    Unique elements (remove the common, intersecting, values)
    Note: equivalent to (left - right) + (right - left).

    >>> lr_symm(['a', 'b'], ['b', 'c'])
    ['a', 'c']

    Is left a superset of (does it contain) the right

    >>> lr_cont(['a', 'b'], ['b', 'c'])
    False
    >>> lr_cont(['a', 'b', 'c'], ['b', 'c'])
    True

    Marginally less trite examples using words

    >>> lwords = 'the quick brown fox'.split()
    >>> rtags = 'brown,fox,jumps,over'.split(',')

    Return all unique words from both lists.

    >>> lr_union(lwords,rtags)
    ['brown', 'over', 'fox', 'quick', 'the', 'jumps']

    Return unique common, intersecting, words. Members of left AND right only.

    >>> lr_intr(lwords,rtags)
    ['brown', 'fox']

    Return unique uncommon words. Members of left OR right, but not both.

    >>> lr_symm(lwords,rtags)
    ['quick', 'the', 'jumps', 'over']

    Note: intersection + symmetric = union, but don't count on their order!
    """

def insecure_demo():
    """Compact method to demo functionality"""
    left, right = list('aab'), list('bcc')
    both = left + right
    both.sort()
    lamb_dict = {'Difference (Remainder of subtract: left - right)': 'lr_diff',
                 'Intersection (Only in left AND right)': 'lr_intr',
                 'Symmetric Difference (Only in left OR right)': 'lr_symm',
                 'Union (Unique list of ALL values)': 'lr_union'}
    print "Demo methods for comparing lists using set/frozenset\n"
    print "'left' list: %s" % repr(left)
    print "'right' list: %s" % repr(right)
    print "'both' lists: %s\n" % repr(both)
    print '-' * 30
    for lamb_desc, lamb_name in lamb_dict.items():
        lamb_func = globals().get(lamb_name)
        resp = lamb_func(left, right)
        resp.sort()
        print lamb_desc  #'Obtain the %s of two lists' % lamb_name.split('_')[-1]
        print '>>> %s(%r, %r)' % (lamb_name, left, right)
        print resp
        print '-' * 30

if __name__ == '__main__':
    import doctest
    doctest.testmod()
    insecure_demo()
    URL = 'http://docs.python.org/2/library/stdtypes.html#set-types-set-frozenset'
    print "\nBe sure to visit:\n", URL
A few 'convenience' lambdas for list-'fu'
These are tasks I have repeatedly seen handled with roll-your-own code. Life is too short for that when good solutions are so close at hand.
Ultimately, if you have been, or are, contemplating writing list-parsing routines to compare, find or eliminate common elements, reduce redundant value-elements, etc., then you'll like sets and frozensets.
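As a quick taste of what sets and frozensets offer, here is a minimal sketch of the operator forms of the same operations, plus the one thing frozenset adds: because it is immutable, it is hashable and can serve as a dictionary key. It is written for Python 3, and the variable names and sample data are made up for illustration.

```python
left = set('aab')    # duplicates collapse: {'a', 'b'}
right = set('bcc')   # {'b', 'c'}

# Operator forms of the set methods. Unlike the methods, which accept
# any iterable as their argument, the operators need sets on both sides.
union = left | right          # left.union(right)
difference = left - right     # left.difference(right)
intersection = left & right   # left.intersection(right)
symmetric = left ^ right      # left.symmetric_difference(right)
contains = left >= right      # left.issuperset(right)

# sorted() gives a stable order for display; sets themselves are unordered.
print(sorted(union))          # ['a', 'b', 'c']
print(sorted(difference))     # ['a']
print(sorted(intersection))   # ['b']
print(sorted(symmetric))      # ['a', 'c']
print(contains)               # False

# A frozenset is immutable and hashable, so it can be used as a dict key.
# Element order does not matter when looking the key up again.
labels = {frozenset(['brown', 'fox']): 'animal tags'}
print(labels[frozenset(['fox', 'brown'])])   # animal tags
```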
All the gory details are in doc-strings and comments, so I won't repeat them here.
The doctests are silent when there are no errors, so a crappy demo function (insecure_demo) runs these functions and prints the results.
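If you would rather see the doctests report every example, passing ones included, doctest has a verbose mode. The sketch below is one way to try it in isolation, written for Python 3; `unique_sorted` and `run_examples` are hypothetical helpers, not part of the script above.

```python
import doctest


def unique_sorted(values):
    """Return the unique items of values as a sorted list.

    >>> unique_sorted('aabbcc')
    ['a', 'b', 'c']
    """
    return sorted(set(values))


def run_examples(func, verbose=True):
    """Run the doctest examples in func's docstring and return the results."""
    parser = doctest.DocTestParser()
    # globs must hold every name the examples use; here that is only func.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=verbose)
    return runner.run(test)


# With verbose=True each example is echoed along with its outcome.
results = run_examples(unique_sorted)
print(results.failed, results.attempted)   # 0 1
```

For a whole script, passing `verbose=True` to `doctest.testmod()` or running `python -m doctest -v yourscript.py` should have the same effect with less ceremony.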
Since this is likely of more use to newcomers, I have included some similar list functionality to highlight differences and usefulness.
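One difference worth spelling out: a list keeps its order and its duplicates, while a set keeps neither. When you want the duplicates gone but the original order kept, a dict can stand in for the set. This is a minimal sketch assuming Python 3.7 or later, where dicts are guaranteed to keep insertion order; `unique_in_order` and `union_in_order` are hypothetical names, not part of the script above.

```python
def unique_in_order(values):
    """Return the unique items of values, in first-seen order."""
    # dict keys are unique and (Python 3.7+) keep insertion order.
    return list(dict.fromkeys(values))


def union_in_order(left, right):
    """Like lr_union, but keeps the order in which items first appear."""
    return unique_in_order(list(left) + list(right))


print(unique_in_order('cbacb'))                          # ['c', 'b', 'a']
print(union_in_order(['a', 'a', 'b'], ['b', 'c', 'c']))  # ['a', 'b', 'c']
print(union_in_order('the quick brown fox'.split(),
                     'brown,fox,jumps,over'.split(',')))
# ['the', 'quick', 'brown', 'fox', 'jumps', 'over']
```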
The inclusion of anonymous functions as "wrapper-code" using lambda is not essential, or even recommended, for the majority of use-cases. Still, they can be useful, especially if renamed to clarify their purpose in whatever context they are deployed.
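For comparison, PEP 8 recommends a def statement over binding a lambda to a name, and a def also gives you room for a docstring. Here is a rough sketch of how lr_intr/lrq_intr and lr_cont might look as named functions (Python 3). The names `common_values` and `contains_all` are invented for illustration, and treating None as empty on both sides is my assumption, going one step beyond lrq_intr.

```python
def common_values(left, right):
    """Return the unique values found in both left and right (unordered).

    None on either side is treated as an empty list (an assumption that
    extends lrq_intr, which only guards the right-hand argument).
    """
    return list(set(left or []).intersection(right or []))


def contains_all(left, right):
    """Return True if every value in right is also present in left."""
    return set(left).issuperset(right)


print(sorted(common_values(['a', 'b'], ['b', 'c'])))   # ['b']
print(common_values(['a', 'b'], None))                 # []
print(common_values(None, ['a', 'b']))                 # []
print(contains_all(['a', 'b', 'c'], ['b', 'c']))       # True
print(contains_all(['a', 'b'], ['b', 'c']))            # False
```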
So copy/paste the useful parts into your script(s), or rewrite as needed. These are so useful that you'll remember the syntax after a couple of uses. They are way too trivial to merit the dependency baggage of a library.