Sorting strings with embedded numbers.

Python, 29 lines
import re

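DIGITS = re.compile(r'[0-9]+')  # matches one run of decimal digits
def compnum(x, y):
    nx = ny = 0  # offsets in x and y up to which both strings compared equal
    while True:
        a = DIGITS.search(x, nx)  # next number in x, or None
        b = DIGITS.search(y, ny)  # next number in y, or None
        if a is None or b is None:  # no number left in one string: compare the rest as text
            return cmp(x[nx:], y[ny:])
        r = (cmp(x[nx:a.start()], y[ny:b.start()]) or  # text before the numbers first,
             cmp(int(x[a.start():a.end()]), int(y[b.start():b.end()])))  # then the numbers by value
        if r:
            return r
        nx, ny = a.end(), b.end()  # equal so far: continue after the numbers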


#
#  sample
#

L1 = ["file~%d.txt"%i for i in range(1,15)]
L2 = L1[:]

L1.sort()
L2.sort(compnum)

for i,j in zip(L1, L2):
    print "%15s %15s" % (i,j)

When you print a list of strings that contain numbers, you will probably not find the built-in list.sort behaviour really satisfying. Because it is a lexicographical sort, you end up with 'string10' between 'string1' and 'string2'. The compnum function can be passed as the list.sort argument to get a more sensible result: it compares the text around the numbers as text, and each run of digits by its numeric value.
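
The listing above is written for Python 2: it relies on the built-in cmp, on list.sort accepting a comparison function as its first argument, and on the print statement, none of which exist in Python 3. What follows is a minimal sketch of the same comparison for Python 3, not part of the original recipe. It assumes a small helper, here called _cmp (a hypothetical name), to stand in for cmp, and uses functools.cmp_to_key to turn the comparison function into a sort key.

```python
import re
from functools import cmp_to_key

DIGITS = re.compile(r'[0-9]+')

def _cmp(a, b):
    # Hypothetical stand-in for Python 2's built-in cmp(): returns -1, 0 or 1.
    return (a > b) - (a < b)

def compnum(x, y):
    nx = ny = 0
    while True:
        a = DIGITS.search(x, nx)
        b = DIGITS.search(y, ny)
        if a is None or b is None:
            # No number left in at least one string: compare the rest as text.
            return _cmp(x[nx:], y[ny:])
        # Compare the text before the numbers, then the numbers by value.
        r = (_cmp(x[nx:a.start()], y[ny:b.start()]) or
             _cmp(int(a.group()), int(b.group())))
        if r:
            return r
        nx, ny = a.end(), b.end()

# Same sample as above: plain sort on the left, compnum sort on the right.
names = ["file~%d.txt" % i for i in range(1, 15)]
plain = sorted(names)
natural = sorted(names, key=cmp_to_key(compnum))

for i, j in zip(plain, natural):
    print("%15s %15s" % (i, j))
```

In the left column 'file~10.txt' comes right after 'file~1.txt', while the right column keeps the files in numeric order from 1 to 14.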