Welcome, guest | Sign In | My Account | Store | Cart

Provides a data structure for a queue of integers whose get() method returns the median element. The interface is similar to the standard Queue module, with an added method top() to retrieve the median without removing it.

Python, 63 lines
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63``` ```import Queue as queue # replace with 'import queue' if using Python 3 class MedianHeap: def __init__(self, numbers = None): self.median = None self.left = queue.PriorityQueue() # max-heap self.right = queue.PriorityQueue() # min-heap self.diff = 0 # difference in size between left and right if numbers: for n in numbers: self.put(n) def top(self): return self.median def put(self, n): if not self.median: self.median = n elif n <= self.median: self.left.put(-n) self.diff += 1 else: self.right.put(n) self.diff -= 1 if self.diff > 1: self.right.put(self.median) self.median = -self.left.get() self.diff = 0 elif self.diff < -1: self.left.put(-self.median) self.median = self.right.get() self.diff = 0 def get(self): median = self.median if self.diff > 0: self.median = -self.left.get() self.diff -= 1 elif self.diff < 0: self.median = self.right.get() self.diff += 1 elif not self.left.empty(): self.median = -self.left.get() self.diff -= 1 elif not self.right.empty(): self.median = self.right.get() self.diff += 1 else: # median was the only element self.median = None return median if __name__ == '__main__': import sys if len(sys.argv) > 1: numbers = map(int, sys.argv[1:]) m = MedianHeap(numbers) print "Median is ", m.get() ```

This is what I came up with, however there is another (I guess similar) method called a min-max-median heap which might be more efficient. In my implementation, the 'root' is the median of the set, while the left and right halves are stored respectively in a max-heap and min-heap (using Queue.PriorityQueue and a dirty trick on the sign of integers). Retrieving the median is constant-time, inserting a new element and removing the median both have logarithmic cost. Andy Barbour 12 years, 1 month ago

Works fine for integers:

``````% Medians.py 1 2 3 4 5
Median is  3
``````

But why is there an exception raised for floats?

``````% Medians.py 1.1 2.2 3.3 4.4 5.5
Traceback (most recent call last):
File "Medians.py", line 77, in <module>
numbers = map(int, sys.argv[1:])
ValueError: invalid literal for int() with base 10: '1.1'
`````` jimmy2times (author) 12 years, 1 month ago

It's because I'm parsing the command-line arguments as integers (look at the error code). You can replace "int" with "float" in line 61 and it should work fine. Ankit Jaiswal 10 years, 2 months ago

I need to add the remove operation as well to this program like put. Any pointers how that can be accomplished? Created by jimmy2times on Sun, 17 Apr 2011 (MIT)