Sometimes for testing purposes you need to fill a database with randomly generated user names. Or maybe you're just offering distinguishable anonymity to users for whatever reason. Or maybe your product needs a codename! This post describes a very simple way to get a bunch of "names".
    import random

    # Ubuntu's word list; adjust the path for your system.
    WORDS_PATH = '/etc/dictionaries-common/words'
    num_dict_lines = 9900  # A-Z, no apostrophes, approximate!
    # readlines() takes a size hint, not a line count, so be generous:
    # lines * avg word len * 8
    size_hint = num_dict_lines * 10 * 8

    with open(WORDS_PATH, encoding='utf-8') as names_file:
        rand_words = [ln for ln in names_file.readlines(size_hint) if "'" not in ln]

    def gen_name():
        # Cap the index so a shorter word list cannot raise IndexError.
        idx = random.randint(2, min(num_dict_lines, len(rand_words) - 1))
        return rand_words[idx].strip()

    # Generate a few samples.
    for i in range(3):
        print(gen_name(), end=' ')
    # Printed: Sister Frankfort Babbitt
This works because the dictionary file is sorted with capitalized entries, mostly proper names, at the top. The script simply grabs the topmost lines and assumes that they're names (which is generally true, but you'll see exceptions).
Assumptions and limitations:
- you want it to be fast and local and don't care much about relevance/accuracy of names
- hard-coded for the Ubuntu word dictionary (adjust for the location of yours; many systems keep it at /usr/share/dict/words)
- your word dictionary may be shorter or longer (but it's of little consequence)
- not all generated words are proper names
- you could be more accurate by slurping the whole file and grabbing only capitalized words, but it would be slower
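The last point can be sketched roughly as follows. This is a minimal illustration, not a benchmarked alternative: `load_capitalized_words` and the one-argument `gen_name` are hypothetical helpers that do not appear in the script above, and the path is the Ubuntu location assumed throughout this post. Note that capitalized entries also include acronyms and the like, so the result is still not guaranteed to be all names.

```python
import os
import random

# Assumption: the Ubuntu word list location used in this post.
WORDS_PATH = "/etc/dictionaries-common/words"

def load_capitalized_words(path):
    """Read the whole file; keep entries that start uppercase and have no apostrophe."""
    with open(path, encoding="utf-8") as words_file:
        words = (line.strip() for line in words_file)
        return [w for w in words if w[:1].isupper() and "'" not in w]

def gen_name(names):
    """Pick one entry uniformly at random from the filtered list."""
    return random.choice(names)

# Only run the demo when the word list is actually present.
if os.path.exists(WORDS_PATH):
    names = load_capitalized_words(WORDS_PATH)
    print(" ".join(gen_name(names) for _ in range(3)))
```

The trade-off is the one described above: every line of the file gets read and tested once at startup, in exchange for not depending on where the capitalized entries happen to end.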