
The built-in "sum" function, as well as the add.reduce functions in Numeric/numarray, introduces a large error when summing large arrays of like elements. I got a relative error of about 1e-9 after summing 10 million doubles between 0 and 1. The function below has error less than 1e-15, doesn't use any additional memory (although it destroys the data array), and also runs asymptotically faster for unlimited-precision numbers.

Python, 12 lines

import math

def sum_with_partials(arr):
    size = len(arr)
    iters = math.ceil(math.log(size) / math.log(2))
    for itr in range(int(iters)):
        step = 2 ** itr
        for i in xrange(0, size, 2 ** (itr + 1)):
            next_i = i + step
            if next_i < size:
                arr[i] += arr[next_i]
    return arr[0]

You can see that sum([0.123456789012345]*10000000) will be off by almost 0.001 from the true answer. The solution above gives the exact answer for this array. I also got absolute errors less than 1e-15 when trying it on arrays of random numbers. For unlimited-precision numbers it also works asymptotically faster: more precisely, regular sum takes O(n(d+log d)) while the function above takes O(nd), where n is the number of elements in the array and d is the number of digits in each element.
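The effect can be reproduced on a smaller scale. Below is a sketch in modern Python 3 (range instead of xrange), with a self-contained copy of the pairwise scheme and math.fsum as an exactly rounded reference; the array size of one million and the constant value are arbitrary choices for the demonstration:

```python
import math

def pairwise_sum(arr):
    # In-place pairwise summation: on pass k, each surviving slot absorbs
    # its neighbour 2**k positions away, so only partial sums of similar
    # magnitude are ever added together.
    size = len(arr)
    iters = int(math.ceil(math.log(size, 2))) if size > 1 else 0
    for itr in range(iters):
        step = 1 << itr
        for i in range(0, size, step * 2):
            if i + step < size:
                arr[i] += arr[i + step]
    return arr[0]

data = [0.123456789012345] * (10 ** 6)
naive = sum(data)                # built-in left-to-right summation
exact = math.fsum(data)          # exactly rounded reference sum
paired = pairwise_sum(data[:])   # pass a copy: the function destroys its input

# The pairwise result is at least as close to the exact sum as the naive one.
assert abs(paired - exact) <= abs(naive - exact)
```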

On my machine the function above runs about 28% faster than the built-in sum when adding 10,000 Decimals. For longs it takes 10 times longer, presumably because the built-in sum is optimized for longs.

Bob Ippolito · 17 years, 5 months ago

That only works for lists; the Python implementation works for arbitrary sequences (anything iterable).

Connelly Barnes · 17 years, 5 months ago

Note on asymptotic time. The O(n(d+log d)) time is for exact decimal/binary arithmetic.

In some cases, floating point manipulations are done with a fixed maximum precision. After this precision, digits are truncated. This could be called truncated arithmetic, and it would take O(nd) time to perform a regular Python sum() in this mode.
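The effect of a fixed working precision is easy to see with Python's decimal module (a sketch; the 5-digit precision and the addend 1.0001 are arbitrary choices for illustration):

```python
from decimal import Decimal, localcontext

exact = Decimal("1.0001") * 1000          # 1000.1000 at the default 28-digit precision

with localcontext() as ctx:
    ctx.prec = 5                          # fixed working precision: 5 significant digits
    total = Decimal(0)
    for _ in range(1000):
        total += Decimal("1.0001")        # each intermediate result is rounded to 5 digits

# Once the running total itself needs 5 digits, the trailing ...0001 of
# each addend is rounded away, so the result falls short of 1000.1.
assert total < exact
```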

— Connelly

Eric-Olivier LE BIGOT · 16 years, 2 months ago

Simpler, more precise, but slower version. Nice recipe!

A simpler, more precise (with slowly varying numbers), but slower and more memory-consuming version would be:

def sum_with_partials(elements):
    if len(elements) == 1:
        return elements[0]  # return the element itself, not the one-element list
    return sum_with_partials(elements[::2]) + sum_with_partials(elements[1::2])

The advantage of this scheme is that, for smoothly varying elements, only numbers with similar values are summed together; this procedure minimizes numerical errors in floating-point arithmetic. In the original recipe, lists with 2**n+1 elements especially suffer from a loss of precision when the last element is added to the sum of all the previous ones.
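A runnable version of this recursive scheme (with the base case returning the single element) illustrates the 2**n+1 case; math.fsum serves as the exactly rounded reference, and the list of 2**10 + 1 copies of 0.1 is an arbitrary choice for the demonstration:

```python
import math

def pairwise_sum_rec(elements):
    # Recursive pairwise summation: split into even- and odd-indexed halves
    # so only similarly sized partial sums are ever combined.
    if len(elements) == 1:
        return elements[0]
    return pairwise_sum_rec(elements[::2]) + pairwise_sum_rec(elements[1::2])

data = [0.1] * (2 ** 10 + 1)   # 2**n + 1 elements: the awkward case noted above
exact = math.fsum(data)        # exactly rounded reference sum
recursive = pairwise_sum_rec(data)

# The recursive result is at least as close to the exact sum as the naive one.
assert abs(recursive - exact) <= abs(sum(data) - exact)
```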

(Side remark: speaking of speed, the original recipe is about 5x slower than the built-in sum with Python 2.4.2 on my machine, but the cost in time is certainly worth the price in some situations.)

Chris Fuller · 15 years, 3 months ago

Speed tips. I haven't checked what impact this has on the whole algorithm's performance, but 1<<10 is about 30% faster than 2**10 for me (Python ought to optimize that one!). Also, you might save a bit by importing ceil and log into the module's namespace rather than looking up two symbols each time.

Created by Yaroslav Bulatov on Wed, 4 Aug 2004 (PSF)