Welcome, guest | Sign In | My Account | Store | Cart

The simplest, direct way of consuming a list in a random fashion is painfully slow for list with a few 100's of elements. There are equally simple, but much faster ways to do this.

Python, 24 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
from random import *

data = range(10000)

# the original, naive version
def select(data):
	if data != []:
		elem = choice(data)
		data.remove(elem)
		return elem
	else:
		return None


# "final" version 
def select(data):
	if data != []:
		pos = randrange( len(data) )
		elem = data[pos]
		data[pos] = data[-1]
		del data[-1]
		return elem
	else:
		return None

The idea is to select a random element, but instead of deleting it (expensively copying the rest of the list frontwards), replacing it with the last element of the list (and deleting it later, which is cheap) As pointed by others, there are several ways to implement this idea, e.g., using it with a shallow copied list (in case you'll want to use all elements but also avoid deletions), or "sorting" the original list with random.shuffle(), but I prefer this simpler, more general approach.

8 comments

Duncan Grisby 20 years, 6 months ago  # | flag

A quicker way to do it... Even the "faster" version is very inefficient since deleting an item from the middle of a list has to move all of the trailing items one place up. It is still O(n**2).

Here is an O(n) algorithm. After it picks the element to remove, it replaces it with the element from the end of the list, avoiding the costly copy of all the trailing elements.

size = len(data)
while size:
    size = size - 1
    index = whrandom.randint(0, size)
    elem = data[index]
    data[index] = data[size]
    print elem
Iuri Wickert (author) 20 years, 6 months ago  # | flag

Re: A quicker way ... Very clever! Thanks for the enlightenment!

Iuri Wickert (author) 20 years, 6 months ago  # | flag

Adding up... Incorporing the helpful comment of Dr. Grisby (above), a pythonic function to select a random element of a list, consuming it, would look like this:

import whrandom

def select(data):
    if data != []:
        index = whrandom.randint(0, len(data) - 1)
        elem = data[index]
        data[index] = data[-1]
        del data[-1]
        return elem
    else:
        return data

This way, no searching and no trailing-element copying are made, but the caller can still rely on the (shrinking) referenced list size.

Alex Martelli 20 years, 5 months ago  # | flag

Use random, not whrandom... The whrandom module is an "internal implementation detail". Module random should be used for most RNG tasks.

Steve Holden 20 years, 4 months ago  # | flag

If duplicating the list isn't a problem ... ... and all list elements are going to be consumed, it would perhaps be faster to make a shallow copy using copy.copy and perform len(list) random pair interchanges. You can then simply iterate over the re-ordered copy using for and delete the copy.

This avoids a lot of unpleasant memory allocation and deallocation.

T Warner 20 years, 4 months ago  # | flag

Even faster... Pull the whrandom lookup out of the loop with a:

from whrandom import randint

Then use randint alone in the code. You'll get a nice performance boost!

Also the suggestion to use random instead of whrandom is probably correct especially since whrandom is being phased out. randint is also deprecated in favor of randomrange, though that doesn't work with older implementations of python.

--Todd Warner

Terry Reedy 20 years, 4 months ago  # | flag

So change it already! So replace your original poor algorithm with this clearly much better one, which shuffles the list as it goes. Some newbie might not read this far.

Alex Martelli 20 years, 1 month ago  # | flag

or even better... just random.shuffle will do it!

Created by Iuri Wickert on Sat, 2 Jun 2001 (PSF)
Python recipes (4591)
Iuri Wickert's recipes (2)

Required Modules

Other Information and Tasks