
Tim Peters' recipe (52560) and bearophile's version (438599) seem a bit too complex. There are speed and sorting issues with each, and neither keeps the data type of the input object. Here is my take on a Python unique() function for sequences (list, tuple, str).

Python, 37 lines
```
def unique(inlist, keepstr=True):
    typ = type(inlist)
    if not typ == list:
        inlist = list(inlist)
    i = 0
    while i < len(inlist):
        try:
            del inlist[inlist.index(inlist[i], i + 1)]
        except:
            i += 1
    if not typ in (str, unicode):
        inlist = typ(inlist)
    else:
        if keepstr:
            inlist = ''.join(inlist)
    return inlist

##
## testing...
##
assert unique( [[1], [2]] ) == [[1], [2]]
assert unique( ((1,),(2,)) ) == ((1,), (2,))
assert unique( ([1,],[2,]) ) == ([1,], [2,])
assert unique( ([1+2J],[2+1J],[1+2J]) ) == ([1+2j], [2+1j])
assert unique( ([1+2J],[1+2J]) ) == ([1+2j],)
assert unique( [0] * 1000 ) == [0]
assert unique( [1, 2, 3, 1, 2]) == [1, 2, 3]
assert unique( [3, 2, 3, 1, 2]) == [3, 2, 1]
s = "iterable dict based unique"
assert unique(s) == 'iterabl dcsunq'
assert unique(s, False) == ['i', 't', 'e', 'r', 'a', 'b', 'l', ' ',
                            'd', 'c', 's', 'u', 'n', 'q']
s = unicode(s)
assert unique(s, False) == [u'i', u't', u'e', u'r', u'a', u'b', u'l', u' ',
                            u'd', u'c', u's', u'u', u'n', u'q']
assert unique(s) == u'iterabl dcsunq'
# all asserts should pass!
```

This version passes all the quasi-unit tests in bearophile's recipe (albeit returning the original object type [optionally for the str type] rather than returning a list unconditionally). No sets or dicts are required, and ordering is preserved. I haven't tested for speed, but subjectively it seems as fast as the other recipes, if not faster. Let me know if there are problems or odd cases I haven't accounted for. As far as my own testing goes, this works very well.

NB: Only tested under 2.4 and 2.5.
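For readers on Python 3, where `unicode` no longer exists as a separate type, the same delete-later-duplicates idea can be sketched roughly as follows. This is an adaptation under assumptions, not the recipe as posted: the name `unique_py3` is hypothetical, the bare `except` is narrowed to `ValueError`, and the input is always copied, because the recipe above deletes from a list argument in place.

```python
def unique_py3(seq, keepstr=True):
    """Order-preserving de-duplication that only needs == on the elements.

    Hypothetical Python 3 adaptation of the recipe above.
    """
    typ = type(seq)
    items = list(seq)  # always copy, so the caller's list is left untouched
    i = 0
    while i < len(items):
        try:
            # Delete the next later occurrence of items[i], if there is one.
            del items[items.index(items[i], i + 1)]
        except ValueError:
            # No later duplicate remains: advance to the next element.
            i += 1
    if typ is str:
        return ''.join(items) if keepstr else items
    return typ(items)
```

Because `list.index` rescans the tail for every element, this stays quadratic in the worst case, like the original.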

Nicer version from Paul Rubin. Paul Rubin posted a better version on the Python mailing list:

```
def unique(seq, keepstr=True):
    t = type(seq)
    if t==str:
        t = (list, ''.join)[bool(keepstr)]
    seen = []
    return t(c for c in seq if not (c in seen or seen.append(c)))
```

Nice!

Fixed to work with unicode string objects.

```
def unique(seq, keepstr=True):
    t = type(seq)
    if t in (str, unicode):
        t = (list, ''.join)[bool(keepstr)]
    seen = []
    return t(c for c in seq if not (c in seen or seen.append(c)))
```
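The filter in the last line works because `list.append` returns `None`, which is falsy. A small sketch of what the `or` expression evaluates to on a first and a second sighting (the variable names `first` and `second` are only illustrative):

```python
seen = []

# First sighting: 'a' in seen is False, so append() runs and returns None.
# "not None" is True, so the generator keeps the element.
first = 'a' in seen or seen.append('a')

# Second sighting: 'a' in seen is True, so append() is never called.
# "not True" is False, so the generator skips the element.
second = 'a' in seen or seen.append('a')
```

After both lines run, `first` is `None`, `second` is `True`, and `seen` holds a single `'a'`.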
Diego Novella 16 years ago

Using a set instead of a list? What about replacing the last two lines with the following?

    seen = set()
    return t(c for c in seq if not (c in seen or seen.add(c)))

This should be faster.

Using set() fails the first assert. The problem is that "list objects are unhashable". Paul Rubin posted a case-optimized version (using sets where possible) to the Python group:

http://tinyurl.com/24zchj
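As a rough illustration of the "sets where possible" idea, and not a reproduction of the linked post, one way to combine the two approaches is to try a set first and fall back to a list only for unhashable items. The name `unique_hybrid` and the two-container layout are assumptions made for this sketch, which is written for Python 3.

```python
def unique_hybrid(seq, keepstr=True):
    """Order-preserving unique(): set lookups when possible, list otherwise.

    Hypothetical sketch; not the code behind the link above.
    """
    t = type(seq)
    if t is str:
        t = ''.join if keepstr else list
    seen_hashable = set()   # fast membership test for hashable items
    seen_other = []         # linear-scan fallback for lists and the like
    result = []
    for item in seq:
        try:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:
            # Unhashable item: the set lookup raised, so use the list instead.
            if item in seen_other:
                continue
            seen_other.append(item)
        result.append(item)
    return t(result)
```

Hashable inputs such as strings and tuples of numbers then avoid the quadratic scan, while the `[[1], [2]]` case from the first assert still works.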

Diego Novella 16 years ago

Thank you for the link, I like that solution :-)

 Created by Jordan Callicoat on Tue, 27 Feb 2007 (PSF)
