Welcome, guest | Sign In | My Account | Store | Cart

New take on merging. Uses Python's "timsort" to merge in O(n) time.

Python, 8 lines
1
2
3
4
5
6
7
8
def merge(s1, s2, *args, **kwds):
    "Merge two sorted sequences."
    s3 = list(s1) + list(s2)
    s3.sort(*args, **kwds)
    return s3

if __name__ == "__main__":
    assert merge(xrange(0,10,2), xrange(1,10,2)) == range(10)

The usual approach to merging is to loop through both sequences taking the smallest from each until they are both exhausted.

Python's "timsort" detects order in underlying sequences and will run a C speed merge on the data. So, all that is involved is concatenating the sequences and running a sort.

Do not use this approach with other sorts. Most will not detect the underlying order and will run in O(n log n) time.

This approach has other advantages too. This function will detect and correct input sequences that are out of order. Also, Python 2.4's sorted() takes arguments for "reversed" and "key" for customing the sort order and for encapulating the decorate-sort-undecorate pattern. Building these features into the usual merge algorithm takes quite a bit of code and does not run as quickly.

1 comment

Vjacheslav Fyodorov 19 years, 4 months ago  # | flag

This is good for inexpencive comparison. Sorting must check already sorted parts in initial path, then make a merging between them. Seems, just need to expose merging stage of sorting algorithm to users in builtin function or lists method.

Created by Raymond Hettinger on Mon, 13 Sep 2004 (PSF)
Python recipes (4591)
Raymond Hettinger's recipes (97)

Required Modules

  • (none specified)

Other Information and Tasks