I needed formatted numbers with thousands-separator commas for an end-user report. The usual advice is to use locale.format, but that didn't work cleanly on my Windows machine, and the cure seemed worse than the disease.

Python, 77 lines
import re

__test__ = {}

re_digits_nondigits = re.compile(r'\d+|\D+')

__test__['re_digits_nondigits'] = r"""

    >>> re_digits_nondigits.findall('$1234.1234')
    ['$', '1234', '.', '1234']
    >>> re_digits_nondigits.findall('1234')
    ['1234']
    >>> re_digits_nondigits.findall('')
    []
    
"""

def FormatWithCommas(format, value):
    """

    >>> FormatWithCommas('%.4f', .1234)
    '0.1234'
    >>> FormatWithCommas('%i', 100)
    '100'
    >>> FormatWithCommas('%.4f', 234.5678)
    '234.5678'
    >>> FormatWithCommas('$%.4f', 234.5678)
    '$234.5678'
    >>> FormatWithCommas('%i', 1000)
    '1,000'
    >>> FormatWithCommas('%.4f', 1234.5678)
    '1,234.5678'
    >>> FormatWithCommas('$%.4f', 1234.5678)
    '$1,234.5678'
    >>> FormatWithCommas('%i', 1000000)
    '1,000,000'
    >>> FormatWithCommas('%.4f', 1234567.5678)
    '1,234,567.5678'
    >>> FormatWithCommas('$%.4f', 1234567.5678)
    '$1,234,567.5678'
    >>> FormatWithCommas('%i', -100)
    '-100'
    >>> FormatWithCommas('%.4f', -234.5678)
    '-234.5678'
    >>> FormatWithCommas('$%.4f', -234.5678)
    '$-234.5678'
    >>> FormatWithCommas('%i', -1000)
    '-1,000'
    >>> FormatWithCommas('%.4f', -1234.5678)
    '-1,234.5678'
    >>> FormatWithCommas('$%.4f', -1234.5678)
    '$-1,234.5678'
    >>> FormatWithCommas('%i', -1000000)
    '-1,000,000'
    >>> FormatWithCommas('%.4f', -1234567.5678)
    '-1,234,567.5678'
    >>> FormatWithCommas('$%.4f', -1234567.5678)
    '$-1,234,567.5678'
    
    """

    parts = re_digits_nondigits.findall(format % (value,))
    for i, s in enumerate(parts):
        # Only the first run of digits (the integer part) gets commas.
        if s.isdigit():
            parts[i] = _commafy(s)
            break
    return ''.join(parts)
    
def _commafy(s):

    r = []
    for i, c in enumerate(reversed(s)):
        if i and (not (i % 3)):
            r.insert(0, ',')
        r.insert(0, c)
    return ''.join(r)

The recipe works by adding commas to the first contiguous group of digits. It could fail with some odd-ball format that puts extra digits before the number.
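
As a rough illustration of both points, here is a minimal Python 3 sketch of the same first-digit-group idea, together with a format that defeats it. The names format_with_commas and _group_digits, and the 'Item 2: %i' format, are made up for this example and are not part of the recipe.

```python
import re

_DIGITS_OR_NOT = re.compile(r'\d+|\D+')

def _group_digits(digits):
    """Insert a comma between each group of three digits, counting from the right."""
    head = len(digits) % 3 or 3          # length of the leftmost group (1 to 3)
    groups = [digits[:head]]
    groups += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return ','.join(groups)

def format_with_commas(fmt, value):
    """Apply fmt to value, then add commas to the first run of digits only."""
    parts = _DIGITS_OR_NOT.findall(fmt % (value,))
    for i, part in enumerate(parts):
        if part.isdigit():
            parts[i] = _group_digits(part)
            break
    return ''.join(parts)

print(format_with_commas('$%.4f', 1234567.5678))   # $1,234,567.5678
print(format_with_commas('Item 2: %i', 1234567))   # Item 2: 1234567
```

In the second call the literal "2" in the format is the first digit group, so the value itself is left without commas.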

15 comments

ralph heimburger 17 years, 6 months ago

Another solution.

# This solution uses the re module and a single function.
import re

def currency(amount):
    temp = "%.2f" % amount
    profile = re.compile(r"(\d)(\d\d\d[.,])")
    while 1:
        temp, count = re.subn(profile, r"\1,\2", temp)
        if not count:
            break
    return '$' + temp

if __name__ == "__main__":
    print(currency(3458905.54))
    print(currency(-49786002.40))

Tim Keating 16 years, 5 months ago

Using locale. FYI, the reason using locale didn't work is probably because you didn't set the locale first:

>>> locale.setlocale(locale.LC_ALL, "")
'English_United States.1252'
>>> locale.format('%d', 12345, True)
'12,345'

Counterintuitive, not well documented, non-Pythonic and just plain sucky, but there you go.
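
On current Python versions locale.format is no longer available (it was deprecated in 3.7 and removed in 3.12); locale.format_string takes the same arguments. A minimal sketch of the same idea, assuming the machine's default locale defines a thousands separator (the "C" locale does not, so the output there has no separator):

```python
import locale

# Adopt the user's default locale; carry on if it is unavailable.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass

# grouping=True asks for the locale's thousands separator, if it defines one.
formatted = locale.format_string('%d', 12345, grouping=True)
print(formatted)  # e.g. 12,345 under an English (US) locale, 12345 under the "C" locale
```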

Tim Ruddick 15 years, 5 months ago

Reverse the digits, add commas after each group of three (except the last), then reverse the result.

Nice and simple, but only works for integers; split on '\.' and rejoin if you have a decimal point.

import re
def thous(x):
    return re.sub(r'(\d{3})(?=\d)', r'\1,', str(x)[::-1])[::-1]

Luciano Ramalho 15 years, 1 month ago

Here is a solution that does not use a regex:

def splitthousands(s, sep=','):
    if len(s) <= 3: return s
    return splitthousands(s[:-3], sep) + sep + s[-3:]

Alexander Whiting 14 years, 5 months ago

Neat Luciano. So a full version for 2 decimal place numbers would be:

def splitthousands(s, sep=','):
    sign = ''
    rhs = ''
    if s[0] == '-':
        sign = '-'
        s = s[1:]
    if s.rfind('.') > 0:
        rhs = s[-3:]
        s = s[:-3]
    if len(s) <= 3:
        return sign + s + rhs
    return sign + splitthousands(s[:-3], sep) + sep + s[-3:] + rhs

Luca Dentis 14 years, 3 months ago

A generalized (and, by the way, internationalized) version of the solutions proposed by Luciano and Alexander is the following:

def splitThousands(s, tSep, dSep=None):
    if s.rfind('.')>0:
        rhs=s[s.rfind('.')+1:]
        s=s[:s.rfind('.')-1]
        if len(s) <= 3: return s + dSep + rhs
        return splitThousands(s[:-3], tSep) + tSep + s[-3:] + dSep + rhs
    else:
        if len(s) <= 3: return s
        return splitThousands(s[:-3], tSep) + tSep + s[-3:]

where:

  • tSep is the thousands separator: a comma in the United States, but a space in France and a dot in Italy and Germany;

  • dSep is the decimal separator: a dot in the United States, but a comma in the other countries mentioned above. dSep defaults to None because integers have no decimal part.
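
For comparison, the two separators can also be swapped in after the fact. The sketch below does not use the recursive approach from this thread: it assumes Python 3, relies on the built-in ',' format option, and then translates both separators. The name split_thousands and the places parameter are made up for this example.

```python
def split_thousands(value, t_sep=',', d_sep='.', places=2):
    """Format a number with custom thousands and decimal separators."""
    text = format(value, ',.%df' % places)       # e.g. '1,234,567.89'
    # Swap both separators in one pass so they cannot clobber each other.
    return text.translate({ord(','): t_sep, ord('.'): d_sep})

print(split_thousands(1234567.891))                        # 1,234,567.89
print(split_thousands(1234567.891, t_sep='.', d_sep=','))  # 1.234.567,89
print(split_thousands(1234567.891, t_sep=' ', d_sep=','))  # 1 234 567,89
```

Unlike the functions in this thread, this one takes a number rather than a string, so rounding to a fixed number of places happens inside the function.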

Duane Harkness 14 years, 2 months ago

The solution proposed by Luciano and Alexander needs a slight change to the fourth line, otherwise it drops the first digit before the decimal place (e.g. "12345.6" returns "1,234.6"). Here is an updated version:

def splitThousands(s, tSep, dSep=None):
    if s.rfind('.')>0:
        rhs=s[s.rfind('.')+1:]
        s=s[:s.rfind('.')]
        if len(s) <= 3: return s + dSep + rhs
        return splitThousands(s[:-3], tSep) + tSep + s[-3:] + dSep + rhs
    else:
        if len(s) <= 3: return s
        return splitThousands(s[:-3], tSep) + tSep + s[-3:]

Duane Harkness 14 years, 2 months ago

Sorry. In my previous post I was actually referring to the solution proposed by Luca Dentis.

Eric Berlin 14 years, 2 months ago

I'd suggest a further refinement. Since you are using a decimal separator variable, that should be used when searching for where to split into the right and left hand sides. Perhaps it would be a good idea to use a reasonable default for dSep and check if it's a decimal that way instead of using the "search for a dot." The whole point of parameterizing tSep and dSep was to allow for internationalization.

# Duane Harkness
# http://code.activestate.com/recipes/498181/
# This function takes three parameters:
#     Number as string to format,
#     Thousands separator,
#     Decimal separator (Period by default). - ECJB.
def splitThousands(s, tSep, dSep="."):
        if dSep != "" and s.rfind(dSep) > 0: # It's a decimal.  Short circuit test if dSep not defined.   Removed hard coded "."
                rhs = s[s.rfind(dSep) + 1:]              # Find decimal part.  Removed hard coded "."
                s = s[:s.rfind(dSep)]                    # Find integer part.
                if len(s) <= 3: return s + dSep + rhs    # If integer part < 1000, just return original number.
                # Recursively add thousands separator, decimal separator, and decimal part.
                return splitThousands(s[:-3], tSep) + tSep + s[-3:] + dSep + rhs
        else:   # Number must be an integer, unless dSep is misdefined.
                if len(s) <= 3: return s                 # Same rules as above, minus the decimal portion.
                # Recursively add thousands separator.
                return splitThousands(s[:-3], tSep) + tSep + s[-3:]

Michael Robellard 14 years, 1 month ago

With support for negative numbers added:

def splitThousands(s, tSep=",", dSep='.'):
    if s == None:
        return 0
    if isinstance(s, int) or isinstance(s,long) or isinstance(s,float):
        s = str(s)
    if dSep != "" and s.rfind(dSep)>0:
        rhs=s[s.rfind(dSep)+1:]
        s=s[:s.rfind(dSep)]
        if len(s) <= 3 or (len(s) == 4 and s[0] == '-'): return s + dSep + rhs
        return splitThousands(s[:-3], tSep) + tSep + s[-3:] + dSep + rhs
    else:
        if len(s) <= 3 or (len(s) == 4 and s[0] == '-'): return s
        return splitThousands(s[:-3], tSep) + tSep + s[-3:]

Glenn 14 years, 1 month ago

Handle leading +

# Code from Michael Robellard's comment made 28 Feb 2010
# Modified for leading + on 1 Mar 2010 by Glenn Linderman
def splitThousands(s, tSep=',', dSep='.'):
    if s == None:
        return 0
    if isinstance(s, int) or isinstance(s,long) or isinstance(s,float):
        s = str(s)
    if s[0] == '-' or s[0] == '+':
        lhs=s[0]
        s=s[1:]
    else:
        lhs=''
    if dSep != '' and s.rfind(dSep)>0:
        rhs=s[s.rfind(dSep)+1:]
        s=s[:s.rfind(dSep)]
        if len(s) <= 3: return lhs + s + dSep + rhs
        return lhs + splitThousands(s[:-3], tSep) + tSep + s[-3:] + dSep + rhs
    else:
        if len(s) <= 3: return lhs + s
        return lhs + splitThousands(s[:-3], tSep) + tSep + s[-3:]

Glenn 14 years, 1 month ago

And maybe

if isinstance(s, int) or isinstance(s,long) or isinstance(s,float):

should instead be

if not isinstance(s, str):

to handle more types that happen to look numeric when stringified, and also to work better on Python 3.x, which doesn't have long.

Glenn 14 years, 1 month ago

This version factors out the recursion to reduce the number of redundant checks. splitThousandsPosInt only handles strings of digits (or whatever Garbage In). splitThousands handles any number of leading spaces or signs, and stops at a decimal point. GIGO, of course. Pass in properly formatted numbers and it will work.

#####################################
def _splitThousandsHelper( s, tSep ):
    if len( s ) <= 3: return s
    return _splitThousandsHelper( s[ :-3 ], tSep ) + tSep + s[ -3: ]
#####################################
def splitThousandsPosInt( s, tSep=',' ):
    if not isinstance( s, str ):
        s = str( s )
    return _splitThousandsHelper( s, tSep )
#####################################
# http://code.activestate.com/recipes/498181-add-thousands-separator-commas-to-formatted-number/
# Code from Michael Robellard's comment made 28 Feb 2010
# Modified for leading +, -, space on 1 Mar 2010 by Glenn Linderman
def splitThousands( s, tSep=',', dSep='.'):
    if s == None:
        return 0
    if not isinstance( s, str ):
        s = str( s )
    cnt = strspn( s, "-+ ")
    lhs = s[ 0:cnt ]
    s = s[ cnt: ]
    if dSep == '':
        cnt = -1
    else:
        cnt = s.rfind( dSep )
    if cnt > 0:
        rhs = dSep + s[ cnt+1: ]
        s = s[ :cnt ]
    else:
        rhs = ''
    return lhs + _splitThousandsHelper( s, tSep ) + rhs
#####################################

Alessandro Forghieri 14 years, 1 month ago

Removed the tail recursion (the helper function is no longer needed) and handled leading signs and garbage differently. (By the way, where is strspn? I could not find it, though I did not look very hard.)
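
On the strspn question: Python has no built-in by that name. It is a C standard library function (string.h) that returns the length of the leading run of characters drawn from a given set. A minimal stand-in, assuming that is the behaviour the previous comment intended, might look like this:

```python
def strspn(s, accept):
    """Return the length of the leading run of s made up only of characters in accept."""
    count = 0
    for ch in s:
        if ch not in accept:
            break
        count += 1
    return count

print(strspn(" + 1234.5", "-+ "))   # 3
print(strspn("1234", "-+ "))        # 0
```

The same count can also be had from len(s) - len(s.lstrip(accept)).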

#####################################
# http://code.activestate.com/recipes/498181-add-thousands-separator-commas-to-formatted-number/
# Code from Michael Robellard's comment made 28 Feb 2010
# Modified for leading +, -, space on 1 Mar 2010 by Glenn Linderman
# 
# Tail recursion removed and  leading garbage handled on March 12 2010, Alessandro Forghieri

def splitThousands( s, tSep=',', dSep='.'):
    '''Splits a general float on thousands. GIGO on general input'''
    if s == None:
        return 0
    if not isinstance( s, str ):
        s = str( s )

    cnt=0
    numChars=dSep+'0123456789'
    ls=len(s)
    while cnt < ls and s[cnt] not in numChars: cnt += 1

    lhs = s[ 0:cnt ]
    s = s[ cnt: ]
    if dSep == '':
        cnt = -1
    else:
        cnt = s.rfind( dSep )
    if cnt > 0:
        rhs = dSep + s[ cnt+1: ]
        s = s[ :cnt ]
    else:
        rhs = ''

    splt=''
    while s != '':
        splt= s[ -3: ] + tSep + splt
        s = s[ :-3 ]

    return lhs + splt[ :-1 ] + rhs

#####################################
if __name__ == "__main__" :
    def doIt(s):
        print("%s\t=>\t%s" % (s, splitThousands(s, '!')))

    for i in [0,1,12,123,1234,12345,123456,1234567,12345678,123456789]:
        doIt(i)

    mant=0.987654321
    for i in [0,1,12,123,1234,12345,123456,1234567,12345678,123456789]:
        doIt(' + '+str(i+mant))
        doIt(-1*(i+mant))

mitch_feaster 12 years, 9 months ago

random googlers: this is now built in to py3k (non-locale aware).

In [52]: '{0:,d}'.format(12345)
Out[52]: '12,345'
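
A few more forms of the same format option, as a sketch (the f-string line assumes Python 3.6 or later). The ',' option comes from PEP 378 and is also available in Python 2.7; it works with floats and negative numbers and always uses a comma, whatever the locale.

```python
amount = -1234567.5678

print('{0:,d}'.format(12345))        # 12,345
print('{:,.4f}'.format(amount))      # -1,234,567.5678
print(f'${abs(amount):,.2f}')        # $1,234,567.57
print(format(1000000, ','))          # 1,000,000
```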