
Two methods to return the intersection/union of sets of data, where the form of the data is not a limiting factor.

Python, 82 lines
```python
def intersection(set1, set2, *args):
    """
    Returns intersection of tuples/lists of data, where

        1) the number of sets is greater than 1,
        2) the dimensionality of the sets is not pre-defined,

            s1 = [(1,), 'e', [[1, 7], (4L, 3j, ('', None))], ([2], {'a': 'b'})]

        3) the sets are not required to share common dimensionality,

            s2 = (1, 2, 3, None)

        4) the returned object is a one-dimensional list of objects which are
           neither TupleType nor ListType and has no duplicates.

            intersection(s1, s2) == [1, 2, None]
    """
    result = []
    sets = []
    sets_append = sets.append
    result_append = result.append
    # Flatten every input into a one-dimensional, duplicate-free list.
    sets_append(union(set1))
    sets_append(union(set2))
    for arg in args:
        sets_append(union(arg))
    # Keep each object of the first set that also occurs in all other sets.
    for obj in sets[0]:
        for i in range(1, len(sets), 1):
            hit = obj in sets[i]
            if not hit:
                break
        if hit:
            result_append(obj)
    return compact(result)


def union(*args):
    """
    Returns union of tuples/lists of data, where

        1) the dimensionality of the sets is not pre-defined,

            s1 = [(1,), 'e', [[1, 7], (4L, 3j, ('', None))], ([2], {'a': 'b'})]

        2) the sets are not required to share common dimensionality,

            s2 = (1, 2, 3, None)

        3) the union of one set is the set stripped of duplicates,
        4) the returned object is a one-dimensional list of objects which are
           neither TupleType nor ListType and has no duplicates.

            union(s1) == [{'a': 'b'}, 'e', 3j, None, 4L, 7, 2, 1, '']
            union(s1, s2) == [{'a': 'b'}, 'e', 3j, None, 4L, 7, 3, 2, 1, '']
    """
    result = []
    sequenceSet = (type([1]), type((1,)))
    result_extend = result.extend
    result_append = result.append
    for arg in args:
        if type(arg) in sequenceSet:
            # Lists and tuples are flattened recursively.
            for obj in arg:
                result_extend(union(obj))
        else:
            # Anything else is treated as a single element.
            result_append(arg)
    return compact(result)


def compact(sequence):
    """
    Returns list of objects in sequence sans duplicates, where

        s1 = (1, 1, (1,), (1,), [1], [1], None, '', 0)

        compact(s1) == [[1], 1, 0, (1,), None, '']
    """
    result = []
    dict_ = {}
    result_append = result.append
    for i in sequence:
        try:
            # Hashable objects are deduplicated as dictionary keys.
            dict_[i] = 1
        except TypeError:
            # Unhashable objects (lists, dicts) fall back to a linear search.
            if i not in result:
                result_append(i)
    # Unhashable objects come first; the result does not follow input order.
    result.extend(dict_.keys())
    return result
```
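The recursive descent that `union` performs on nested lists and tuples can also be looked at in isolation. The following is a minimal sketch, not part of the recipe: `flatten` is a hypothetical helper name, it assumes Python 3 (so the long literal `4L` from the docstrings becomes `4`), and it only flattens, leaving duplicate removal aside.

```python
def flatten(obj):
    """Yield the leaves of obj that are neither lists nor tuples, depth first.

    Hypothetical helper for illustration; it mirrors the exact type check
    used in union, so subclasses of list/tuple are treated as leaves.
    """
    if type(obj) in (list, tuple):
        for item in obj:
            # Recurse into nested sequences of any depth.
            yield from flatten(item)
    else:
        # Strings, numbers, dicts, None, etc. are single elements.
        yield obj


# Python 3 version of the s1 example from the docstrings.
s1 = [(1,), 'e', [[1, 7], (4, 3j, ('', None))], ([2], {'a': 'b'})]
print(list(flatten(s1)))
```

The duplicate `1` still appears twice in this output; removing it is the job of `compact` in the recipe.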

I am fairly new to programming, and these are the first methods I have put together that might be of some value here. I have tried to make intersection and union as flexible as possible. I understand that compact is not the fastest method for removing duplicates, but I like it because it is small, flexible, reliable, and reasonably fast in general. Any corrections and/or advice on how to improve intersection or union will be greatly appreciated. I look forward to both learning from and submitting to this site often. FMHj
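On the subject of `compact`, one possible variation is sketched below. It is an assumption-laden alternative rather than the author's method: `compact_ordered` is a hypothetical name, and the sketch assumes Python 3. It keeps the same idea (a hash-based lookup for hashable objects, a linear search for unhashable ones) but returns the objects in first-seen order.

```python
def compact_ordered(sequence):
    """Return the objects of sequence without duplicates, in first-seen order.

    Hypothetical variant of compact, shown for illustration only.
    """
    seen = set()       # hashable objects already emitted
    unhashable = []    # unhashable objects already emitted
    result = []
    for item in sequence:
        try:
            # Membership test on a set raises TypeError for unhashable items.
            if item in seen:
                continue
            seen.add(item)
        except TypeError:
            # Fall back to a linear search for lists, dicts and the like.
            if item in unhashable:
                continue
            unhashable.append(item)
        result.append(item)
    return result


s1 = (1, 1, (1,), (1,), [1], [1], None, '', 0)
print(compact_ordered(s1))  # [1, (1,), [1], None, '', 0]
```

As with the dictionary approach in the recipe, objects that compare equal and hash alike (for example `1` and `1.0`) count as duplicates.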

 Created by FMHj . on Thu, 11 Jul 2002 (PSF)

### Required Modules

• (none specified)