Welcome, guest | Sign In | My Account | Store | Cart

Computes an infinite sequence of primes using simple generators. A Python dictionary is used to mark multiples of the generated primes, according to the Sieve of Eratosthenes.

Python, 18 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
# Sieve of Eratosthenes
# David Eppstein, UC Irvine, 28 Feb 2002

from __future__ import generators

def eratosthenes():
	'''Yields the sequence of prime numbers via the Sieve of Eratosthenes.'''
	D = {}  # map composite integers to primes witnessing their compositeness
	q = 2   # first integer to test for primality
	while 1:
		if q not in D:
			yield q        # not marked composite, must be prime
			D[q*q] = [q]   # first multiple of q not already marked
		else:
			for p in D[q]: # move each witness to its next multiple
				D.setdefault(p+q,[]).append(p)
			del D[q]       # no longer need D[q], free memory
		q += 1

As Wim Stolker suggested, I have modified this to use setdefault, 1 Mar 2001.

17 comments

Wim Stolker 22 years, 1 month ago  # | flag

Shortcut. Nice short and clear recipe, however,

Instead of:

if D.has_key(p+q):
    D
[p+q].append(p)
else:
    D
[p+q] = [p]

I would use:

D.setdefault(p+q,[]).append(p)

See: Add an entry to a dictionary, unless the entry is already there.

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66516

Alex Martelli 21 years, 1 month ago  # | flag

slightly faster by keeping a dict of single numbers instead of one of lists. neat indeed, but we can be marginally faster (about 40%+ on my machine for primes up to 1,000,000 for example) by keeping in the dict a single value for each composite number -- the first prime factor that revealed to us that number is a composite. Here's the Python 2.3 code for the enhanced main loop:

while True:
    p
= D.pop(q, None)
   
if p:
        x
= p + q
       
while x in D: x += p
        D
[x] = p
   
else:
        D
[q*q] = q
       
yield q
    q
+= 1

this uses the new-in-2.3 "fetch and remove" method .pop of dictionaries, but that's marginal (and I put just the same small enhancement in the list-using version -- I posted both to comp.lang.python just now) -- the small speedup in this variant comes from not having to build so many small lists.

Alex

N N 18 years, 6 months ago  # | flag

I hapen to like haskell's. haskell's canonical version:

primes = sieve [ 2.. ]
         
where
         sieve
(p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]

translates roughly to python as:

import itertools as it
def sieve(num=it.count(2)):
    p
=num.next()
    x
=sieve(n for n in num if n%p)
   
yield p
   
while True: yield x.next()

Unfortunately it exceeds the recursion limit after about 3000 on my machine.

Tim Hochberg 18 years, 2 months ago  # | flag

Slightly faster yet by skipping even numbers.

def sieve():
   
yield 2
    D
= {}
    q
= 3
   
while True:
        p
= D.pop(q, 0)
       
if p:
            x
= q + p
           
while x in D: x += p
            D
[x] = p
       
else:
           
yield q
            D
[q*q] = 2*q
        q
+= 2

It's too bad itertools.count doesn't take a step argument! This would look cleaner, and likely be marginally faster, with

for q in count(3,step=2)

rather than the current while True cruft.

Tim Hochberg 18 years, 2 months ago  # | flag

Another recursive version.

from itertools import count, ifilter
def sieve():
    seq
= count(2)
   
while True:
        p
= seq.next()
        seq
= ifilter(p.__rmod__, seq)
       
yield p

This uses ifilter to generate the recursive sequence. It's much less vulnerable to stack overflow than the version that recurses on sieve, but otherwise it's pretty similar.

I managed to generate all primes up to 909,691 before it bombed the Python interpreter mysteriously, but it takes a while!

Matteo Dell'Amico 15 years, 2 months ago  # | flag

Faster generation of primes up to n.

I can get a roughly 2.5x speedup when generating primes up to a given number. By keeping the whole filter in memory, it's possible to do list lookup instead of dictionary lookup.

def primes_up_to(n):
   
"""Generates all primes less than n."""
   
if n <= 2: return
   
yield 2
    F
= [True] * n
    seq1
= xrange(3, int(sqrt(n)), 2)
    seq2
= xrange(seq1[-1] + 2, n, 2)
   
for p in ifilter(F.__getitem__, seq1):
       
yield p
       
for q in xrange(p * p, n, 2 * p):
            F
[q] = False
   
for p in ifilter(F.__getitem__, seq2):
       
yield p
Thomas Ahle 14 years, 5 months ago  # | flag

I managed to create a python onliner based on #3.

It is not terrible efficient memorywise, but it is a Sieve.

Tell me if you can do it smaller:

>>> primes = lambda l: l and l[:1]+primes([n for n in l if n%l[0]])
>>> primes(range(2,40))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
mojave kid 14 years, 4 months ago  # | flag

Im pitching in my implementation :

def GetPrimes(limit):

prime_list = range(2,limit+1)

for elem in prime_list:
   
if elem**2 <= max:
       
for temp in prime_list:
           
if temp == elem:
               
pass
           
else:
               
if temp % elem == 0:
                    prime_list
.remove(temp)

return prime_list
Wolfgang Beneicke 14 years ago  # | flag

I noticed that the dict method applied to odd numbers only (see # 5) has to check for even numbers while searching for the next multiple (e.g. q=9, p=3 leads to D[12] being set unnecessarily). So someone suggested:

        while x in D or not (x&1):
            x
+= p

But, iff q and p are odd, q+2*p will always be odd. This shaves off another 30% of the runtime of this already slick algorithm.

Additionally, the 'while' loop is replaced by the itertools.count() iterator, as has been suggested before.

import itertools

D
= {}
yield 2    # no need to mark D[4] as we will test odd numbers only

for q in itertools.islice(itertools.count(3),0,None,2):
    p
= D.pop(q,None)      # get and remove
   
if p:                      #  is composite
        x
= q + p+p                # next odd(!) multiple
       
while x in D:              # skip primes
            x
+= p+p
        D
[x] = p
   
else:                      # is prime
        D
[q*q] = q
       
yield q
Rahul Raj 13 years, 11 months ago  # | flag

Clean and clear code.. But do not scale well for big numbers.. What do you guys think about my version here http://rahulrajpl.wordpress.com/2010/05/15/sieve_of_eratosthenes-with-efficient-scaling/

Mepuka Kessy 13 years, 10 months ago  # | flag

def iprime_generator(n):

startlst = range(2, n+1)
primes
= []

while startlst[0]*startlst[0] < n:
    primes
= primes + [startlst[0]]
    startlst
= [elem for elem in startlst if elem%startlst[0]!=0]

return primes + startlst
It seems fairly efficient for numbers less than a million, I'm a noob so I'm still trying to figure out how to optimize it. Please post feedback, thanks!
Paul Hofstra 12 years, 6 months ago  # | flag

The code of Wolfgang Beneicke still can be improved a little. The following code is about 10% faster.

import  itertools

def erat2a():
    D
= {}
   
yield 2
   
for q in itertools.islice(itertools.count(3), 0, None, 2):
        p
= D.pop(q, None)
       
if p is None:
            D
[q * q] = 2 * q # use here 2 * q
           
yield q
       
else:
            x
= p + q
           
while x in D:
                x
+= p
            D
[x] = p
Will Ness 11 years, 11 months ago  # | flag

The space complexity is needlessly O(n). O(sqrt(n)) is enough indeed, achieved by postponing the addition of stepping information for a prime into the dict until that prime's square is seen amongst the candidates, and not a moment sooner. Having much smaller dict, performance and empirical time complexity improve too. See http://ideone.com/WFv4f.

Will Ness 11 years, 11 months ago  # | flag

That's http://ideone.com/WFv4f (why can't I edit my comment?...).

Will Ness 11 years, 7 months ago  # | flag

Here's the code for the postponed sieve mentioned above:

def postponed_sieve():                    # postponed sieve, by Will Ness      
 
yield 2; yield 3; yield 5; yield 7;     #        http://ideone.com/WFv4f
  D
= {}                                  # based on original code by
  c
= 9                                   #   David Eppstein, UC Irvine, 28 Feb 2002
  ps
= (p for p in postponed_sieve())     #   from http://code.activestate.com/
  p
= ps.next() and ps.next()             # 3       recipes/117119-sieve-of-eratosthenes/
  q
= p*p                                 # 9
 
while True:
   
if c not in D:
       
if c < q: yield c
       
else:
            add
(D,c + 2*p,2*p)
            p
=ps.next()
            q
=p*p
   
else:
        s
= D.pop(c)
        add
(D,c + s,s)
    c
+= 2

def add(D,x,s):
 
while x in D: x += s
  D
[x] = s
Bjorn Madsen 11 years, 1 month ago  # | flag

Well to complement the ones above; here is a "set"-based implementation.

def sieve(n):
    z
= {2} # set of primes
    N
= {x for x in range(3,n+1,2)} # natural numbers in range.
    size
=len(N)
   
while size>0: # stop when N is exhausted.
        sn
= min(N) # sn: smallest number in N
       
for i in z:
           
if sn%i==0: # it is divisible with a prime and cannot be a prime.
                m
= {n for n in range(sn,n+1,sn)} # m is set of multiples of sn
                N
= N - m # using deduction of sets to update N
           
elif sn<i*i: # match-point has been passed.
                z
.add(sn)
                m
= {n for n in range(sn,n+1,sn)} # m is set of multiples of sn
                N
= N - m # using deduction of sets to update N
               
break
        size
=len(N)
   
return z

if __name__ =='__main__':
   
import time
    start
=time.clock()
    L
= sieve(104729+1) # 104729 prime number 10,000
   
end=time.clock()
   
print 'total runtime: ' + str(end-start)

total runtime: 15.5625 for the first 104730 numbers.

Matteo Dell'Amico 8 years, 1 month ago  # | flag

There was a bug in my 7-years old implementation (a missing '+ 1'); thanks to @xn0 who spotted it.

Here's the corrected version:

def primes_up_to(n):
   
"""Generates all primes less than n."""
   
if n <= 2: return
   
yield 2
    F
= [True] * n
    seq1
= xrange(3, int(sqrt(n)) + 1, 2)
    seq2
= xrange(seq1[-1] + 2, n, 2)
   
for p in ifilter(F.__getitem__, seq1):
       
yield p
       
for q in xrange(p * p, n, 2 * p):
            F
[q] = False
   
for p in ifilter(F.__getitem__, seq2):
       
yield p
Created by David Eppstein on Thu, 28 Feb 2002 (PSF)
Python recipes (4591)
David Eppstein's recipes (14)

Required Modules

Other Information and Tasks