Extremely simple/straightforward implementation.
Posted mainly to receive comments on alternative implementations.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | def irandomize(iterable, seed=None):
"""
Randomize the iterable.
Note: this evaluates the entire iterable to a sequence in memory
"""
if seed is None:
from random import shuffle
else:
from random import Random
shuffle = Random(seed).shuffle
result = list(iterable)
shuffle(result)
return iter(result)
if __name__ == "__main__":
print list(irandomize(range(10)))
print list(irandomize(range(10)))
print list(irandomize(range(10), seed=14))
print list(irandomize(range(10), seed=14))
|
While randomizing an iterable can be useful, this implementation is very basic.
So, does anyone have better ideas? :)
Tags: algorithms
What do you gain by this? The list() call loads everything into memory, so pretending like it's still an iterator but calling iter() seems misleading to me....
Fitting in an existing iterator-centric scheme. I agree that there is no implementation benefit to the iter() call, but it conforms to the interface where iterator modifiers are expected.
The scheme involved has a 'source' iterator, and chains together several iterator-modifying functions to receive a final iterator. A configuration controls which modifying functions are used at run-time.
I know this implementation is a problem. I'd love to hear suggestions for a better one...
more iterator-like implementation. Well, this isn't as random as random.shuffle -- items near the beginning of the original iterable will likely be near the beginning of the resulting iterable -- but at least it maintains the memory behavior of iterators.
In worst-case behavior, it could load all items from the iterable into memory at once, but on the average I believe it should only be storing about 10 at a time.
Length of unknown iterator has no upper bound. If you want to guarantee that memory limits are never exceeded, you will have to specify an upper bound somewhere.
I think the best way to tackle this is to provide a buffer-length argument to irandomize():
etc. Once the buffer is full, you choose a random element, replace it with the next element from the iterator, and yield the chosen element. "Shuffle" is never needed.