Welcome, guest | Sign In | My Account | Store | Cart

Multidimensional Pareto front (Python recipe) by Jamie Bull
ActiveState Code (http://code.activestate.com/recipes/578287/)

Method for finding non-dominated options in multi-dimensional data.

      import numpy as np

def pareto_frontier_multi(myArray):
    # Sort on first dimension
    myArray = myArray[myArray[:,0].argsort()]
    # Add first row to pareto_frontier
    pareto_frontier = myArray[0:1,:]
    # Test next row against the last row in pareto_frontier
    for row in myArray[1:,:]:
        if sum([row[x] >= pareto_frontier[-1][x]
                for x in range(len(row))]) == len(row):
            # If it is better on all features add the row to pareto_frontier
            pareto_frontier = np.concatenate((pareto_frontier, [row]))
    return pareto_frontier

def test()
    myArray = np.array([[1,1,1],[2,2,2],[4,4,4],[3,3,3]])
    print pareto_frontier_multi(myArray)

test()

      

This requires that your Pareto frontier is looking for maximum values on each dimension (e.g. bigger is better) and so it is important that you pass a numpy array where your features have been recoded to suit this requirement.

If you have a two dimensional data set, see my other Pareto recipe which allows you to choose whether you want to find maximum or minimum values for each dimension.

More info on Pareto frontiers is available at oCo Carbon.

Tags: engineering, finance, optimal, pareto, pareto_front, pareto_frontier, pareto_set

1 comment

JvP 9 years, 8 months ago # | flag

The algorithm above does not work for me; it has a running time of O(N), while the fastest known running time is O(N log d). Simple Cull runs in O(N^2).

This one works for partial orderings, given that you can define the dominates method for checking if one point dominates another (based on the Simple Cull algorithm described in this paper).

def simple_cull(inputPoints, dominates):
    paretoPoints = set()
    candidateRowNr = 0
    dominatedPoints = set()
    while True:
        candidateRow = inputPoints[candidateRowNr]
        inputPoints.remove(candidateRow)
        rowNr = 0
        nonDominated = True
        while len(inputPoints) != 0 and rowNr < len(inputPoints):
            row = inputPoints[rowNr]
            if dominates(candidateRow, row):
                # If it is worse on all features remove the row from the array
                inputPoints.remove(row)
                dominatedPoints.add(tuple(row))
            elif dominates(row, candidateRow):
                nonDominated = False
                dominatedPoints.add(tuple(candidateRow))
                rowNr += 1
            else:
                rowNr += 1

        if nonDominated:
            # add the non-dominated point to the Pareto frontier
            paretoPoints.add(tuple(candidateRow))

        if len(inputPoints) == 0:
            break
    return paretoPoints, dominatedPoints

Example usage:

def dominates(row, candidateRow):
    return sum([row[x] >= candidateRow[x] for x in range(len(row))]) == len(row)    

inputPoints = [[1,1,1], [1,2,3], [3,2,1], [4,1,1]]
paretoPoints, dominatedPoints = simple_cull(inputPoints, dominates)

print "*"*8 + " non-dominated answers " + ("*"*8)
for p in paretoPoints:
    print p
print "*"*8 + " dominated answers " + ("*"*8)
for p in dominatedPoints:
    print p

Created by Jamie Bull on Sat, 13 Oct 2012 (MIT)

◄	Python recipes (4591)	►
◄	Jamie Bull's recipes (5)	►

Required Modules

numpy

Other Information and Tasks

Licensed under the MIT License
Viewed 24674 times
Revision 2 (updated 11 years ago)

Accounts

Code Recipes

Feedback & Information

ActiveState

© 2024 ActiveState Software Inc. All rights reserved. ActiveState®, Komodo®, ActiveState Perl Dev Kit®, ActiveState Tcl Dev Kit®, ActivePerl®, ActivePython®, and ActiveTcl® are registered trademarks of ActiveState. All other marks are property of their respective owners.

Multidimensional Pareto front (Python recipe) by Jamie Bull ActiveState Code (http://code.activestate.com/recipes/578287/)