The simplest, direct way of consuming a list in a random fashion is painfully slow for list with a few 100's of elements. There are equally simple, but much faster ways to do this.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | from random import *
data = range(10000)
# the original, naive version
def select(data):
if data != []:
elem = choice(data)
data.remove(elem)
return elem
else:
return None
# "final" version
def select(data):
if data != []:
pos = randrange( len(data) )
elem = data[pos]
data[pos] = data[-1]
del data[-1]
return elem
else:
return None
|
The idea is to select a random element, but instead of deleting it (expensively copying the rest of the list frontwards), replacing it with the last element of the list (and deleting it later, which is cheap) As pointed by others, there are several ways to implement this idea, e.g., using it with a shallow copied list (in case you'll want to use all elements but also avoid deletions), or "sorting" the original list with random.shuffle(), but I prefer this simpler, more general approach.
A quicker way to do it... Even the "faster" version is very inefficient since deleting an item from the middle of a list has to move all of the trailing items one place up. It is still O(n**2).
Here is an O(n) algorithm. After it picks the element to remove, it replaces it with the element from the end of the list, avoiding the costly copy of all the trailing elements.
Re: A quicker way ... Very clever! Thanks for the enlightenment!
Adding up... Incorporing the helpful comment of Dr. Grisby (above), a pythonic function to select a random element of a list, consuming it, would look like this:
This way, no searching and no trailing-element copying are made, but the caller can still rely on the (shrinking) referenced list size.
Use random, not whrandom... The whrandom module is an "internal implementation detail". Module random should be used for most RNG tasks.
If duplicating the list isn't a problem ... ... and all list elements are going to be consumed, it would perhaps be faster to make a shallow copy using copy.copy and perform len(list) random pair interchanges. You can then simply iterate over the re-ordered copy using for and delete the copy.
This avoids a lot of unpleasant memory allocation and deallocation.
Even faster... Pull the whrandom lookup out of the loop with a:
Then use randint alone in the code. You'll get a nice performance boost!
Also the suggestion to use random instead of whrandom is probably correct especially since whrandom is being phased out. randint is also deprecated in favor of randomrange, though that doesn't work with older implementations of python.
--Todd Warner
So change it already! So replace your original poor algorithm with this clearly much better one, which shuffles the list as it goes. Some newbie might not read this far.
or even better... just random.shuffle will do it!