Sieve of Eratosthenes « Python recipes

Computes an infinite sequence of primes using simple generators. A Python dictionary is used to mark multiples of the generated primes, according to the Sieve of Eratosthenes.

      # Sieve of Eratosthenes
# David Eppstein, UC Irvine, 28 Feb 2002

from __future__ import generators

def eratosthenes():
	'''Yields the sequence of prime numbers via the Sieve of Eratosthenes.'''
	D = {}  # map composite integers to primes witnessing their compositeness
	q = 2   # first integer to test for primality
	while 1:
		if q not in D:
			yield q        # not marked composite, must be prime
			D[q*q] = [q]   # first multiple of q not already marked
		else:
			for p in D[q]: # move each witness to its next multiple
				D.setdefault(p+q,[]).append(p)
			del D[q]       # no longer need D[q], free memory
		q += 1

      

As Wim Stolker suggested, I have modified this to use setdefault, 1 Mar 2001.

Tags: algorithms

17 comments

Wim Stolker 22 years, 2 months ago # | flag

Shortcut. Nice short and clear recipe, however,

Instead of:

if D.has_key(p+q):
    D[p+q].append(p)
else:
    D[p+q] = [p]

I would use:

D.setdefault(p+q,[]).append(p)

See: Add an entry to a dictionary, unless the entry is already there.

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66516

Alex Martelli 21 years, 1 month ago # | flag

slightly faster by keeping a dict of single numbers instead of one of lists. neat indeed, but we can be marginally faster (about 40%+ on my machine for primes up to 1,000,000 for example) by keeping in the dict a single value for each composite number -- the first prime factor that revealed to us that number is a composite. Here's the Python 2.3 code for the enhanced main loop:

while True:
    p = D.pop(q, None)
    if p:
        x = p + q
        while x in D: x += p
        D[x] = p
    else:
        D[q*q] = q
        yield q
    q += 1

this uses the new-in-2.3 "fetch and remove" method .pop of dictionaries, but that's marginal (and I put just the same small enhancement in the list-using version -- I posted both to comp.lang.python just now) -- the small speedup in this variant comes from not having to build so many small lists.

Alex

N N 18 years, 6 months ago # | flag

I hapen to like haskell's. haskell's canonical version:

primes = sieve [ 2.. ]
         where
         sieve (p:x) = p : sieve [ n | n &lt;- x, n `mod` p &gt; 0 ]

translates roughly to python as:

import itertools as it
def sieve(num=it.count(2)):
    p=num.next()
    x=sieve(n for n in num if n%p)
    yield p
    while True: yield x.next()

Unfortunately it exceeds the recursion limit after about 3000 on my machine.

Tim Hochberg 18 years, 2 months ago # | flag

Slightly faster yet by skipping even numbers.

def sieve():
    yield 2
    D = {}
    q = 3
    while True:
        p = D.pop(q, 0)
        if p:
            x = q + p
            while x in D: x += p
            D[x] = p
        else:
            yield q
            D[q*q] = 2*q
        q += 2

It's too bad itertools.count doesn't take a step argument! This would look cleaner, and likely be marginally faster, with

for q in count(3,step=2)

rather than the current while True cruft.

Tim Hochberg 18 years, 2 months ago # | flag

Another recursive version.

from itertools import count, ifilter
def sieve():
    seq = count(2)
    while True:
        p = seq.next()
        seq = ifilter(p.__rmod__, seq)
        yield p

This uses ifilter to generate the recursive sequence. It's much less vulnerable to stack overflow than the version that recurses on sieve, but otherwise it's pretty similar.

I managed to generate all primes up to 909,691 before it bombed the Python interpreter mysteriously, but it takes a while!

Matteo Dell'Amico 15 years, 2 months ago # | flag

Faster generation of primes up to n.

I can get a roughly 2.5x speedup when generating primes up to a given number. By keeping the whole filter in memory, it's possible to do list lookup instead of dictionary lookup.

def primes_up_to(n):
    """Generates all primes less than n."""
    if n <= 2: return
    yield 2
    F = [True] * n
    seq1 = xrange(3, int(sqrt(n)), 2)
    seq2 = xrange(seq1[-1] + 2, n, 2)
    for p in ifilter(F.__getitem__, seq1):
        yield p
        for q in xrange(p * p, n, 2 * p):
            F[q] = False
    for p in ifilter(F.__getitem__, seq2):
        yield p

Thomas Ahle 14 years, 5 months ago # | flag

I managed to create a python onliner based on #3.

It is not terrible efficient memorywise, but it is a Sieve.

Tell me if you can do it smaller:

>>> primes = lambda l: l and l[:1]+primes([n for n in l if n%l[0]])
>>> primes(range(2,40))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

mojave kid 14 years, 4 months ago # | flag

Im pitching in my implementation :

def GetPrimes(limit):

prime_list = range(2,limit+1)

for elem in prime_list:
    if elem**2 <= max:
        for temp in prime_list:
            if temp == elem:
                pass
            else:
                if temp % elem == 0:
                    prime_list.remove(temp) 

return prime_list

Wolfgang Beneicke 14 years, 1 month ago # | flag

I noticed that the dict method applied to odd numbers only (see # 5) has to check for even numbers while searching for the next multiple (e.g. q=9, p=3 leads to D[12] being set unnecessarily). So someone suggested:

        while x in D or not (x&1):
            x += p

But, iff q and p are odd, q+2*p will always be odd. This shaves off another 30% of the runtime of this already slick algorithm.

Additionally, the 'while' loop is replaced by the itertools.count() iterator, as has been suggested before.

import itertools

D = {}
yield 2    # no need to mark D[4] as we will test odd numbers only

for q in itertools.islice(itertools.count(3),0,None,2):
    p = D.pop(q,None)      # get and remove
    if p:                      #  is composite
        x = q + p+p                # next odd(!) multiple
        while x in D:              # skip primes
            x += p+p
        D[x] = p
    else:                      # is prime
        D[q*q] = q
        yield q

Rahul Raj 13 years, 11 months ago # | flag

Clean and clear code.. But do not scale well for big numbers.. What do you guys think about my version here http://rahulrajpl.wordpress.com/2010/05/15/sieve_of_eratosthenes-with-efficient-scaling/

Mepuka Kessy 13 years, 10 months ago # | flag

def iprime_generator(n):

startlst = range(2, n+1)
primes = []

while startlst[0]*startlst[0] < n:
    primes = primes + [startlst[0]]
    startlst = [elem for elem in startlst if elem%startlst[0]!=0]

return primes + startlst

It seems fairly efficient for numbers less than a million, I'm a noob so I'm still trying to figure out how to optimize it. Please post feedback, thanks!

Paul Hofstra 12 years, 6 months ago # | flag

The code of Wolfgang Beneicke still can be improved a little. The following code is about 10% faster.

import  itertools

def erat2a():
    D = {}
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q * q] = 2 * q # use here 2 * q
            yield q
        else:
            x = p + q
            while x in D:
                x += p
            D[x] = p

Will Ness 11 years, 11 months ago # | flag

The space complexity is needlessly O(n). O(sqrt(n)) is enough indeed, achieved by postponing the addition of stepping information for a prime into the dict until that prime's square is seen amongst the candidates, and not a moment sooner. Having much smaller dict, performance and empirical time complexity improve too. See http://ideone.com/WFv4f.

Will Ness 11 years, 11 months ago # | flag

That's http://ideone.com/WFv4f (why can't I edit my comment?...).

Will Ness 11 years, 7 months ago # | flag

Here's the code for the postponed sieve mentioned above:

def postponed_sieve():                    # postponed sieve, by Will Ness      
  yield 2; yield 3; yield 5; yield 7;     #        http://ideone.com/WFv4f
  D = {}                                  # based on original code by
  c = 9                                   #   David Eppstein, UC Irvine, 28 Feb 2002
  ps = (p for p in postponed_sieve())     #   from http://code.activestate.com/
  p = ps.next() and ps.next()             # 3       recipes/117119-sieve-of-eratosthenes/
  q = p*p                                 # 9
  while True:
    if c not in D:
        if c < q: yield c
        else:
            add(D,c + 2*p,2*p)
            p=ps.next()
            q=p*p
    else:
        s = D.pop(c)
        add(D,c + s,s)
    c += 2

def add(D,x,s):
  while x in D: x += s
  D[x] = s

Bjorn Madsen 11 years, 1 month ago # | flag

Well to complement the ones above; here is a "set"-based implementation.

def sieve(n):
    z = {2} # set of primes
    N = {x for x in range(3,n+1,2)} # natural numbers in range.
    size=len(N)
    while size>0: # stop when N is exhausted.
        sn = min(N) # sn: smallest number in N
        for i in z:
            if sn%i==0: # it is divisible with a prime and cannot be a prime.
                m = {n for n in range(sn,n+1,sn)} # m is set of multiples of sn
                N = N - m # using deduction of sets to update N
            elif sn<i*i: # match-point has been passed.
                z.add(sn)
                m = {n for n in range(sn,n+1,sn)} # m is set of multiples of sn
                N = N - m # using deduction of sets to update N
                break
        size=len(N)
    return z

if __name__ =='__main__':
    import time
    start=time.clock()
    L = sieve(104729+1) # 104729 prime number 10,000
    end=time.clock()
    print 'total runtime: ' + str(end-start)

total runtime: 15.5625 for the first 104730 numbers.

Matteo Dell'Amico 8 years, 1 month ago # | flag

There was a bug in my 7-years old implementation (a missing '+ 1'); thanks to @xn0 who spotted it.

Here's the corrected version:

def primes_up_to(n):
    """Generates all primes less than n."""
    if n <= 2: return
    yield 2
    F = [True] * n
    seq1 = xrange(3, int(sqrt(n)) + 1, 2)
    seq2 = xrange(seq1[-1] + 2, n, 2)
    for p in ifilter(F.__getitem__, seq1):
        yield p
        for q in xrange(p * p, n, 2 * p):
            F[q] = False
    for p in ifilter(F.__getitem__, seq2):
        yield p

◄	Python recipes (4591)	►
◄	David Eppstein's recipes (14)	►

Sieve of Eratosthenes (Python recipe) by David Eppstein
ActiveState Code (http://code.activestate.com/recipes/117119/)

17 comments

It seems fairly efficient for numbers less than a million, I'm a noob so I'm still trying to figure out how to optimize it. Please post feedback, thanks!

total runtime: 15.5625 for the first 104730 numbers.

Tags

Required Modules

Other Information and Tasks

Accounts

Code Recipes

Feedback & Information

ActiveState

Sieve of Eratosthenes (Python recipe) by David Eppstein ActiveState Code (http://code.activestate.com/recipes/117119/)

17 comments

It seems fairly efficient for numbers less than a million, I'm a noob so I'm still trying to figure out how to optimize it. Please post feedback, thanks!

total runtime: 15.5625 for the first 104730 numbers.

Tags

Required Modules

Other Information and Tasks

Accounts

Code Recipes

Feedback & Information

ActiveState

Sieve of Eratosthenes (Python recipe) by David Eppstein
ActiveState Code (http://code.activestate.com/recipes/117119/)