ActiveState Code

Recipe 305269: Merge sorted sequences


New take on merging. Uses Python's "timsort" to merge in O(n) time.

Python
1
2
3
4
5
6
7
8
def merge(s1, s2, *args, **kwds):
    "Merge two sorted sequences."
    s3 = list(s1) + list(s2)
    s3.sort(*args, **kwds)
    return s3

if __name__ == "__main__":
    assert merge(xrange(0,10,2), xrange(1,10,2)) == range(10)

Discussion

The usual approach to merging is to loop through both sequences taking the smallest from each until they are both exhausted.

Python's "timsort" detects order in underlying sequences and will run a C speed merge on the data. So, all that is involved is concatenating the sequences and running a sort.

Do not use this approach with other sorts. Most will not detect the underlying order and will run in O(n log n) time.

This approach has other advantages too. This function will detect and correct input sequences that are out of order. Also, Python 2.4's sorted() takes arguments for "reversed" and "key" for customing the sort order and for encapulating the decorate-sort-undecorate pattern. Building these features into the usual merge algorithm takes quite a bit of code and does not run as quickly.

Comments

  1. 1. At 10:12 p.m. on 29 nov 2004, Vjacheslav Fyodorov said:

    This is good for inexpencive comparison. Sorting must check already sorted parts in initial path, then make a merging between them. Seems, just need to expose merging stage of sorting algorithm to users in builtin function or lists method.

Sign in to comment