Completely eliminates rounding errors and loss of significance due to catastrophic cancellation during summation. Achieves exactness by keeping full precision intermediate subtotals. Offers three alternative approaches, each using a different technique to store exact subtotals.
Python, 109 lines
There are other recipes that mitigate round-off errors during floating point summation (see recipe 298339 for example). This one goes beyond mitigation and is provably exact. Other features include an O(n) typical runtime, a tiny memory footprint, and accepting any iterable input.
Tim Peters provided a good test case that defeats some other attempts at accurate summation:
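The recipe's code listing is not reproduced in this excerpt; the test case in question is the widely circulated one in which huge values cancel exactly while the small values would otherwise be absorbed and lost:

```python
# Pairs of 1e100 and -1e100 cancel exactly, but ordinary left-to-right
# summation absorbs each 1.0 into 1e100 and loses it entirely.
data = [1.0, 1e100, 1.0, -1e100] * 10000

print(sum(data))    # naive summation returns 0.0; the exact sum is 20000.0
```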
The msum() function maintains an array of partial sums stored as floats. The array is kept in increasing order of magnitude except for the last entry, which can be zero. The partials are non-overlapping (the lowest non-zero bit of the larger value is above the highest bit of the smaller value). Together, the array components span the full range of precision in the exact sum. In practice, the list of partial sums rarely has more than a few entries.
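A sketch of msum() along these lines (this is a reconstruction consistent with the description, in the form the algorithm later circulated; treat it as illustrative rather than the original listing):

```python
def msum(iterable):
    "Full-precision summation using multiple floats for the partial sums."
    partials = []               # sorted, non-overlapping partial sums
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x     # ensure |x| >= |y|
            hi = x + y          # rounded sum
            lo = y - (hi - x)   # exact round-off error of hi
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]      # drop exhausted partials, append the high part
    return sum(partials, 0.0)
```

Note that each inner-loop step is an exact two-float transformation: hi + lo equals x + y exactly, so no information is ever discarded until the final rounding.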
For proof of msum()'s correctness, see Shewchuk's "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates" at http://www-2.cs.cmu.edu/afs/cs/project/quake/public/papers/robust-arithmetic.ps . The exact summation of two floats, x+y, is proved in theorem 6. Keeping the overall sum exact through hi/lo summations is proved in theorem 10. The results follow from IEEE-754 arithmetic guarantees such as subtraction being exact whenever the two components are within a factor of two of each other. The algorithms work under various rounding modes including round-to-nearest (the 754 default).
The paper includes a caveat (in section 5) that the computation of lo may be incorrectly optimized away by some C compilers. This does not occur in Python code which has no algebraic optimizations beyond constant folding. Running dis.dis(msum) demonstrates that each addition and subtraction has its own opcode and has not been algebraically reduced.
A second function, lsum(), employs an alternative approach and provides an independent cross-check on msum(). It also computes a full precision floating point sum but keeps the exact intermediate subtotals in the form of an integer exponent and a full precision mantissa stored as a long integer. At each step, the mantissa of either the addend or the running total is left-shifted until the two share a common exponent; the mantissas are then summed directly. The running total is always exactly equal to: tmant * 2.0 ** texp.
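A reconstruction of lsum() in this spirit (the shift-to-a-common-exponent bookkeeping is as described above; the hard-coded 53-bit mantissa width reflects the assumption noted in the additional notes below):

```python
from math import frexp

def lsum(iterable):
    "Full-precision summation keeping the subtotal as (integer mantissa, exponent)."
    tmant, texp = 0, 0                      # running total == tmant * 2.0 ** texp
    for x in iterable:
        mant, exp = frexp(x)                # x == mant * 2**exp, 0.5 <= |mant| < 1
        mant, exp = int(mant * 2.0 ** 53), exp - 53   # make the mantissa an integer
        if texp > exp:
            tmant <<= texp - exp            # rescale the total to the finer exponent
            texp = exp
        else:
            mant <<= exp - texp             # rescale the addend to the total's exponent
        tmant += mant                       # exact integer addition
    return float(str(tmant)) * 2.0 ** texp
```

The final line rounds only once, converting the exact integer mantissa through a string and then scaling by an exact power of two.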
Another approach is shown in a third function, dsum(), which uses Decimal objects for the exact intermediate subtotals. The math.frexp() function exactly splits the float into a mantissa and exponent. The mantissas are doubled and exponents are lowered until the two components are both integers. From these two integers, the float's value is exactly computed as a Decimal. The context precision grows as needed to accommodate all significant digits.
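A reconstruction of dsum() under the same 53-bit assumption; trapping Inexact and retrying with a larger context is one way to realize the "precision grows as needed" behavior described above (the retry loop here is an illustrative choice, not necessarily the original mechanism):

```python
from decimal import Decimal, getcontext, Inexact
from math import frexp

getcontext().traps[Inexact] = True      # demand exact Decimal arithmetic

def dsum(iterable):
    "Full-precision summation using a Decimal object for the subtotal."
    total = Decimal(0)
    for x in iterable:
        mant, exp = frexp(x)            # x == mant * 2**exp
        mant, exp = int(mant * 2.0 ** 53), exp - 53   # integer mantissa
        while True:
            try:
                total += mant * Decimal(2) ** exp     # exact, or raises Inexact
                break
            except Inexact:
                getcontext().prec *= 2  # grow precision until the step is exact
    return float(str(total))
```

Because every power of two has a finite decimal expansion, each step becomes exact once the precision is large enough, so the loop always terminates.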
All three functions give identical answers, accurate to full precision. They each take different approaches to storing exact intermediate subtotals (i.e. multiple floats, a long integer, or a Decimal object). After the summation, the exact result is rounded to a regular float, accurate to within about one half unit in the last place.
----- Additional notes ------------------------
The code for dsum() and lsum() assumes 53 bit precision floats. To remove that assumption, replace this line:
Also, dsum() and lsum() build the result float using float(str(x)), where x is either a Decimal or a long integer. In both cases, CPython employs the platform strtod() function, whose round-off/truncation behavior is not fully specified by C89 or by IEEE-754's string-to-float conversion rules (special thanks to Tim Peters for this information). For exact subtotals falling almost exactly between two representable floating point values, the least significant bit can be misrounded on some systems (hence, the error bound is very slightly more than ½ unit in the last place). On my system, the 1-bit misrounding occurs infrequently (about once in every several thousand calls to lsum()). Regardless, since rounding occurs after exact summation, the result is consistent for all data orderings (i.e. the commutative law of addition holds exactly).
The lsum() function runs on Py2.0 and later (augmented assignment was introduced in 2.0). The msum() function runs on Py2.3 and later (the builtin sum() function was introduced in 2.3). Both msum() and lsum() are easily modified to work on very old versions of Python. In contrast, the dsum() function depends completely on Python 2.4's decimal module, and it does not run nearly as fast as msum() and lsum().