ActiveState Code

Recipe 366178: A fast prime number list generator


This is a fast prime number list generator using sieve algorithm. This function return a list of prime numbers which <= argument.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
def primes(n): 
	if n==2: return [2]
	elif n<2: return []
	s=range(3,n+1,2)
	mroot = n ** 0.5
	half=(n+1)/2-1
	i=0
	m=3
	while m <= mroot:
		if s[i]:
			j=(m*m-3)/2
			s[j]=0
			while j<half:
				s[j]=0
				j+=m
		i=i+1
		m=2*i+3
	return [2]+[x for x in s if x]

print primes(13)
print primes(3000)

------------- we get ------------
[2, 3, 5, 7, 11, 13]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163
, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251
, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349
, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443
, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557
, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647
, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757
, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863
, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983
, 991, 997, 1009, 1013, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 10
69, 1087, 1091, 1093, 1097, 1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171
, 1181, 1187, 1193, 1201, 1213, 1217, 1223, 1229, 1231, 1237, 1249, 1259, 1277,
1279, 1283, 1289, 1291, 1297, 1301, 1303, 1307, 1319, 1321, 1327, 1361, 1367, 13
73, 1381, 1399, 1409, 1423, 1427, 1429, 1433, 1439, 1447, 1451, 1453, 1459, 1471
, 1481, 1483, 1487, 1489, 1493, 1499, 1511, 1523, 1531, 1543, 1549, 1553, 1559,
1567, 1571, 1579, 1583, 1597, 1601, 1607, 1609, 1613, 1619, 1621, 1627, 1637, 16
57, 1663, 1667, 1669, 1693, 1697, 1699, 1709, 1721, 1723, 1733, 1741, 1747, 1753
, 1759, 1777, 1783, 1787, 1789, 1801, 1811, 1823, 1831, 1847, 1861, 1867, 1871,
1873, 1877, 1879, 1889, 1901, 1907, 1913, 1931, 1933, 1949, 1951, 1973, 1979, 19
87, 1993, 1997, 1999, 2003, 2011, 2017, 2027, 2029, 2039, 2053, 2063, 2069, 2081
, 2083, 2087, 2089, 2099, 2111, 2113, 2129, 2131, 2137, 2141, 2143, 2153, 2161,
2179, 2203, 2207, 2213, 2221, 2237, 2239, 2243, 2251, 2267, 2269, 2273, 2281, 22
87, 2293, 2297, 2309, 2311, 2333, 2339, 2341, 2347, 2351, 2357, 2371, 2377, 2381
, 2383, 2389, 2393, 2399, 2411, 2417, 2423, 2437, 2441, 2447, 2459, 2467, 2473,
2477, 2503, 2521, 2531, 2539, 2543, 2549, 2551, 2557, 2579, 2591, 2593, 2609, 26
17, 2621, 2633, 2647, 2657, 2659, 2663, 2671, 2677, 2683, 2687, 2689, 2693, 2699
, 2707, 2711, 2713, 2719, 2729, 2731, 2741, 2749, 2753, 2767, 2777, 2789, 2791,
2797, 2801, 2803, 2819, 2833, 2837, 2843, 2851, 2857, 2861, 2879, 2887, 2897, 29
03, 2909, 2917, 2927, 2939, 2953, 2957, 2963, 2969, 2971, 2999]

Discussion

This program is almost 8 times faster than the other recipe(fast prime number list creator), and it's more than 2 time faster than Alex's following code:

sieve = range(max+1) def dosieve(sieve, max): for i in range(2, math.sqrt(max)+1): if not sieve[i]: continue for j in range(i*i, max+1, i): sieve[j] = None dosieve(sieve, max)

primes=map(str,filter(None, sieve[2:]))

one reason it's faster is that is it operate only on odd number. Another reason is n**0.5 is slightly faster than math.sqrt(n) :)

Comments

  1. 1. At 1:26 p.m. on 7 feb 2005, Wensheng Wang (the author) said:

    Oops! The second reason I gave is not correct. If the reviewer see this, please delete that second reason. Also please indent Alex Martelli's code. thanks.

  2. 2. At 8:25 p.m. on 7 feb 2005, Kazuo Moriwaka said:

    another reason. Thanks for making progress. I think another reason why your code is faster is old (my) recipe was create lists each steps, but your code just rewrite numbers to 0.

  3. 3. At 2:05 a.m. on 13 feb 2005, Pierre Quentel said:

    Improvement. Very nice recipe, thanks. A slight improvement : the function raises exceptions for some odd numbers (primes(13) for instance). It is safer if you change the value of "half" to len(s)

  4. 4. At 2:42 p.m. on 17 feb 2005, Michael Spencer said:

    Nice speed, but the indexing is a little opaque... Here's the same algorithm with simpler indexing:

    def primes2(n):
        if nHere's the same algorithm with simpler indexing:
    
    <pre>
    def primes2(n):
        if n
    

    </pre>

  5. 5. At 2:49 p.m. on 17 feb 2005, Michael Spencer said:

    How do you add code in the comments. I've tried surrounding it in pre tags. I give up...

  6. 6. At 12:26 a.m. on 22 apr 2005, Jimmy Hoeks said:

    Out by 1? You say the function returns primes that are less than the argument. Did you mean less than or equal to the argument?

    In fact, as it stands, the function can return a prime higher than the argument (try it for 3000, for instance). Changing 's=range(3,n+2,2)' to 's=range(3,n+1,2)' would ensure you get all primes up to and including the argument.

  7. 7. At 11:16 a.m. on 24 apr 2005, Paul Miller said:

    Seems to be a bug. For some reason, I get an IndexError for certain prime or prime-power values of n. For example, try n=25, n=251.

  8. 8. At 8:58 a.m. on 7 jun 2005, Giuliano Castelli said:

    purpose for better implementation. Would it be possible to implement an analogous so efficient algorithm as primes(n), but with two inputs, primes(s,n), where s is the starting number from which the primes begin to be calculate? Instead of beginning always from 2. Thank you!

  9. 9. At 11:56 a.m. on 11 oct 2005, Simon Gomizelj said:

    bug fix. as many of you coders have noticed that with this code there are some values that fail, such as 13, 25, 251, and 3000. there is a simple solution to this problem, the line reads

    half = (n+1) / 2

    should read

    half = (n+1) / 2 - 1

    this variable seems to be the size of the list considering it should be 1/2 of the original list's size because we are working only with odd numbers. however, the coder forgot to take into account that he skipped the odd value of 1 and started straight at three. so half really was the size of the list + 1. most numbers, when looping, had increments that we greater than this + 1, but sometime it fell in it, hence giving an element doesnt exist error.

  10. 10. At 2:18 p.m. on 23 jun 2006, Wensheng Wang (the author) said:

    thank all. Thank all the commentors and your suggestions. I have made the correction: s=range(3,n+1,2) and half=(n+1)/2-1

    Thanks.

  11. 11. At 8:27 a.m. on 17 aug 2007, Hugh Brown said:

    More pythonesque. This works the same but has a cleaner loop:

    for i in range(0, int(mroot/2)) :
        m=2*i+3
        if s[i]:
            for j in range((m*m-3)/2, half, m) :
                s[j]=0
    
  12. 12. At 8:51 a.m. on 17 aug 2007, Hugh Brown said:

    More pythonesque. This works the same but has a cleaner loop:

    for i in range(0, int(mroot/2)) :
        m=2*i+3
        if s[i]:
            for j in range((m*m-3)/2, half, m) :
                s[j]=0
    
  13. 13. At 4:17 p.m. on 6 nov 2007, Dr. Drang said:

    Twice as fast? In my benchmarks (on an Intel iMac with n=1,000,000, n=10,000,000, and n=20,000,000), the recipe code is only about 10% faster than the "Alex" code, although its memory use is probably much smaller.

    The code below runs about 50% faster than the "Alex" code and isn't too obscure, I think:

    nroot = int(sqrt(n))
    sieve = range(n+1)
    sieve[1] = 0
    
    for i in xrange(2, nroot+1):
        if sieve[i] != 0:
            m = n/i - i
            sieve[i*i: n+1:i] = [0] * (m+1)
    
    sieve = [x for x in sieve if x !=0]
    

    The slice assignment seems to be the big time saver. The three terms that define the slice are the same as the three terms that define the range for j in the "Alex" code, which is why I think the code is still readable. m is a little tricky; it's the number of multiples of i that are greater than i**2 and less than or equal to n. The integer division used to calculate m will have to be changed for Python 3.

    Since zero is false, I suppose the "!= 0" parts could be omitted from the conditionals, but I think it's easier to read with them in.

  14. 14. At 11:34 p.m. on 8 mar 2009, larsspam said:

    I have made the correction: s=range(3,n+1,2) and half=(n+1)/2-1

    Actually I still encountered the IndexError, even with these corrections. In my case, n was 100000.

    half was 49999.5 and j was 49999.

    I'm using Python 3. I guess the problem is that I need to use the new // integer division operator for computing both half and j, as Dr Drang mentioned.

    Also, for others who may be using Python 3, you'll have to change the initial "s=range(3,n+1,2)" to "s=list(range(3,n+1,2))".

    At least, that's what worked for me, a Python novice. Would be nice if someone more knowledgeable would say something definitive.

  15. 15. At 9:15 a.m. on 7 apr 2009, Shfif py said:

    actually 209 is a prime number , but it isn't showing on your primes(3000) result list

    Someone asked for a function that accepts a lower bound:

    def primes(n,low=1):
      '''return a list of prime numbers between LOW and N'''
      low = low / 2 * 2 + 1   # convert LOW to odd number
      lNum = range(low,n+1,2) # create a list of all the odd numbers between LOW and N
      iRoot= n ** 0.5
      iMid = len(lNum)        # number of numbers in the list
      i = 0
      m = 3                   # 2 is removed from the list because we didn't include the even numbers, so we start with removing
                              # the 3 multiples (9,15,21 ....)
      while m < iRoot:
          if lNum[i] != 0:      # if we haven't removed the number and its multiplies already
                                # I don't like this line, I think it might work in the original algorithm, but might make this one skip
                                # some numbers. If it does skip numbers, remove this 'if' statement
              j = (m*m - low) / 2   # j is the _index_ of the first multiple of m to remove
              while (j<iMid):
    
                      if (j >= 0):        # ignore indexes below zero (for obvious reasons)
                        lNum[j] = 0
                  j+=m
    
          i += 1
          m += 2
    
      return [x for x in lNum if x != 0]
    
  16. 16. At 9:27 a.m. on 7 apr 2009, Shfif py said:

    my mistake 209 / 11 is 19.

  17. 17. At 7:12 a.m. on 16 jun 2009, patrick said:

    For generating a large number of primes, the following improvement to Dr. Drang's code is helpful: def primes_up_to(n): nroot = int(sqrt(n)) sieve = [True] * (n+1) sieve[0] = False sieve[1] = False

        for i in xrange(2, nroot+1):
            if sieve[i]:
                m = n/i - i
                sieve[i*i: n+1:i] = [False] * (m+1)
    
        return [i for i in xrange(n+1) if sieve[i]]
    

    It's the same code, except it uses an array of booleans instead of ints for the sieve. For n = 50 million, it uses about 1/4 the memory and executes 25% faster (9 seconds instead of 12) on my machine.

  18. 18. At 7:13 a.m. on 16 jun 2009, patrick said:

    Sorry, I'm new here and haven't yet learned to use the preview button...

    def primes_up_to(n):
        nroot = int(sqrt(n))
        sieve = [True] * (n+1)# range(n+1)
        sieve[0] = False
        sieve[1] = False
    
        for i in xrange(2, nroot+1):
            if sieve[i]:
                m = n/i - i
                sieve[i*i: n+1:i] = [False] * (m+1)
    
        return [i for i in xrange(n+1) if sieve[i]]
    

Sign in to comment