Welcome, guest | Sign In | My Account | Store | Cart
1

This algorithm takes as input a function for computing matrix values, and searches for the position of maximum value in each row. The matrix must satisfy the "totally monotone" property: in each submatrix (in particular each 2x2 submatrix) the positions of the maxima must move leftward as you go down the rows. The algorithm uses this property to greatly reduce the number of matrix elements evaluated, compared to a naive algorithm that explicitly constructs the matrix.

As a simple example, we apply the algorithm to finding nearest neighbors in B for each point in A, where B may be distributed arbitrarily in space but the points of A lie along a single line. Using SMAWK for this problem takes only linear time if the input is already sorted.

Python, 75 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
# SMAWK totally monotone matrix searching algorithm
# David Eppstein, UC Irvine, 17 Mar 2002

# reference:
# A. Aggarwal, M. Klawe, S. Moran, P. Shor and R. Wilber,
# Geometric applications of a matrix searching algorithm,
# Algorithmica 2, pp. 195-208 (1987).

def smawk(rows,cols,lookup):
    '''Search for row-maxima in a 2d totally monotone matrix M[i,j].
    The input is specified by a list of row indices, a list of column
    indices, and a function "lookup" satisfying lookup(i,j) = M[i,j].
    The matrix must satisfy the totally monotone ordering property:
    if i occurs before i' in rows, j occurs before j' in cols, and
    M[i,j] < M[i,j'], then also M[i',j] < M[i',j'].  The result is
    returned as a dictionary mapping row i to the column j containing
    the largest value M[i,j].  Ties are broken in favor of earlier
    columns.  The number of calls to lookup is O(len(rows)+len(cols)).'''
	
    # base case of recursion
    if not rows: return {}
	
    # reduce phase: make number of columns at most equal to number of rows
    stack = []
    for c in cols:
        while len(stack) >= 1 and \
          lookup(rows[len(stack)-1],stack[-1]) < lookup(rows[len(stack)-1],c):
            stack.pop()
        if len(stack) != len(rows):
            stack.append(c)

    cols = stack

    # recursive call to search for every odd row
    result = smawk([rows[i] for i in xrange(1,len(rows),2)],cols,lookup)

    # go back and fill in the even rows
    c = 0
    for r in xrange(0,len(rows),2):
        row = rows[r]
        if r == len(rows) - 1:
            cc = len(cols)-1  # if r is last row, search through last col
        else:
            cc = c            # otherwise only until pos of max in row r+1
            target = result[rows[r+1]]
            while cols[cc] != target:
                cc += 1
        result[row] = max([ (lookup(row,cols[x]),-x,cols[x]) \
                            for x in xrange(c,cc+1) ]) [2]
        c = cc

    return result


# Example application: All nearest neighbors of points on a line
# Exercise: verify that negDist satisfies the totally monotone matrix property.

def linearNearestNeighbors(A,B):
    '''Takes as input a set of points A and B in some Euclidean space.
    A lies on the x-axis and is given as a sequence of real numbers;
    B is given in Cartesian coordinates (tuples of real numbers).
    Output is a dict mapping each point in A to the nearest point in B.'''

    # negate squared Euclidean distance to convert minimization of distance
    # into a maximization problem suitable for SMAWK algorithm
    def square(x): return x*x
    def negDist(a,b):
        dist = square(b[0]-a)
        for i in range(1,len(b)):
            dist += square(b[i])
        return -dist
	
    A.sort()	# make sure points are sorted, omit if already sorted
    B.sort()
    return smawk(A,B,negDist)

See http://citeseer.nj.nec.com/context/164110/0 for research on algorithms based on the SMAWK technique. Totally monotone matrix searching problems come up in computational geometry (nearest neighbors as shown here, all farthest neighbors on a convex polygon, largest empty rectangle, smallest enclosing triangle, finding bridge connecting two convex sets to minimize total diameter); graph theory (various problems involving shortest paths in planar graphs); chemistry (center and labeling problems in benzenoid systems); data compression (Huffman coding with unequal letter costs); computational biology (sequence alignment, RNA secondary structure); and text processing (breaking paragraphs into lines).

Python's built-in dictionary and list data structures make it easy to perform the recursion at the heart of the SMAWK algorithm, passing down subsequences of the row indices and passing back up mappings from row indices to column indices. In addition (as shown in the nearest-neighbor example) there is no requirement for the matrix index sets to be contiguous ranges of integers.

Add a comment

Sign in to comment

Created by David Eppstein on Sun, 17 Mar 2002 (PSF)
Python recipes (4591)
David Eppstein's recipes (14)

Required Modules

  • (none specified)

Other Information and Tasks