Welcome, guest | Sign In | My Account | Store | Cart

Method for finding non-dominated options in multi-dimensional data.

Python, 20 lines
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20``` ```import numpy as np def pareto_frontier_multi(myArray): # Sort on first dimension myArray = myArray[myArray[:,0].argsort()] # Add first row to pareto_frontier pareto_frontier = myArray[0:1,:] # Test next row against the last row in pareto_frontier for row in myArray[1:,:]: if sum([row[x] >= pareto_frontier[-1][x] for x in range(len(row))]) == len(row): # If it is better on all features add the row to pareto_frontier pareto_frontier = np.concatenate((pareto_frontier, [row])) return pareto_frontier def test() myArray = np.array([[1,1,1],[2,2,2],[4,4,4],[3,3,3]]) print pareto_frontier_multi(myArray) test() ```

This requires that your Pareto frontier is looking for maximum values on each dimension (e.g. bigger is better) and so it is important that you pass a numpy array where your features have been recoded to suit this requirement.

If you have a two dimensional data set, see my other Pareto recipe which allows you to choose whether you want to find maximum or minimum values for each dimension.

#### 1 comment JvP 7 years, 4 months ago

The algorithm above does not work for me; it has a running time of O(N), while the fastest known running time is O(N log d). Simple Cull runs in O(N^2).

This one works for partial orderings, given that you can define the `dominates` method for checking if one point dominates another (based on the Simple Cull algorithm described in this paper).

``````def simple_cull(inputPoints, dominates):
paretoPoints = set()
candidateRowNr = 0
dominatedPoints = set()
while True:
candidateRow = inputPoints[candidateRowNr]
inputPoints.remove(candidateRow)
rowNr = 0
nonDominated = True
while len(inputPoints) != 0 and rowNr < len(inputPoints):
row = inputPoints[rowNr]
if dominates(candidateRow, row):
# If it is worse on all features remove the row from the array
inputPoints.remove(row)
elif dominates(row, candidateRow):
nonDominated = False
rowNr += 1
else:
rowNr += 1

if nonDominated:
# add the non-dominated point to the Pareto frontier

if len(inputPoints) == 0:
break
return paretoPoints, dominatedPoints
``````

Example usage:

``````def dominates(row, candidateRow):
return sum([row[x] >= candidateRow[x] for x in range(len(row))]) == len(row)

inputPoints = [[1,1,1], [1,2,3], [3,2,1], [4,1,1]]
paretoPoints, dominatedPoints = simple_cull(inputPoints, dominates)

print "*"*8 + " non-dominated answers " + ("*"*8)
for p in paretoPoints:
print p
print "*"*8 + " dominated answers " + ("*"*8)
for p in dominatedPoints:
print p
`````` Created by Jamie Bull on Sat, 13 Oct 2012 (MIT)