ActiveState Code

Recipe 65226: Expanding and Compressing Tabs


You want to convert tabs in a string to the appropriate number of spaces, or vice versa.

Python
# of course, in Python, we do have a built-in tab expansion string method:
# method .expandtabs(tabsize=8) of string objects!  If we *didn't* have
# it, though, here's how we might make it ourselves...:

# string processing tends to be faster in a split/process/rejoin
# approach than by repeated overall-string transformations, so...:
def expand_with_re(astring, tablen=8):
    import re
    pieces = re.split(r'(\t)', astring)
    lensofar = 0
    for i in range(len(pieces)):
        if pieces[i]=='\t':
            pieces[i] = ' '*(tablen-lensofar%tablen)
        lensofar += len(pieces[i])
    return ''.join(pieces)

# note we used re.split, rather than plain string splitting, because
# re.split with a '(group)' in the re gives us the splitters too,
# which is quite handy here for us to massage the pieces list into
# our desired form for the final ''.join.  However, '\t'.split,
# "interleaving" the blank joiners, looks a bit better still:
def expand(astring, tablen=8):
    result = []
    for piece in astring.split('\t'):
        result.append(piece)
        result.append(' '*(tablen-len(piece)%tablen))
    return ''.join(result[:-1])

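# for the 'unexpanding', though, the "joiners" (spaces) are
# really crucial, so let's go back to the re approach (and
# _here_ we don't have a built-in method of strings...!);
# astring is assumed to be a single line with no tabs in it:
def unexpand(astring, tablen=8):
    import re
    pieces = re.split(r'( +)', astring)
    lensofar = 0
    for i in range(len(pieces)):
        thislen = len(pieces[i])
        # startswith is safe on the empty pieces re.split can produce
        if pieces[i].startswith(' '):
            # blanks remaining after the last tab stop this run reaches
            numblanks = (lensofar+thislen)%tablen
            # a run that ends before the next tab stop stays as it is
            if numblanks <= thislen:
                numtabs = (thislen-numblanks+tablen-1)//tablen
                pieces[i] = '\t'*numtabs + ' '*numblanks
        lensofar += thislen
    return ''.join(pieces)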
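
One way to sanity-check a hand-rolled expansion is to compare it with the built-in str.expandtabs. The sketch below restates the split-based expand so that it runs on its own; the `samples` list is hypothetical test data, not part of the recipe. Note that str.expandtabs resets the column at each newline, while this function counts from the start of the string, so the comparison is only meant for single-line input.

```python
def expand(astring, tablen=8):
    # Split on tabs, then pad each piece out to the next tab stop.
    result = []
    for piece in astring.split('\t'):
        result.append(piece)
        result.append(' ' * (tablen - len(piece) % tablen))
    # The last padding has no tab behind it, so drop it.
    return ''.join(result[:-1])

# Hypothetical single-line samples (no newlines).
samples = ['a\tb', '\tindented', 'abcdefgh\tx', 'no tabs here', '']

for s in samples:
    for tablen in (2, 4, 8):
        assert expand(s, tablen) == s.expandtabs(tablen)
```

Using len(piece) rather than a running total works here because every piece after the first starts at a tab stop, so its length modulo tablen equals its end column modulo tablen.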

Discussion

Inspired by Recipe 1.7 in O'Reilly's Perl Cookbook. Again we see substantially the same power, packaged as a cryptic one-liner in Perl and as a few more readable statements in Python. Python makes tab expansion, a very frequent task, available in the easiest way, as a string method. It omits 'unexpansion' (inserting tabs to stand for spaces), a task that is less frequent and should maybe be discouraged (if you need to compress, there are far better ways :-). Perl places both expand and unexpand in a standard package. Either approach is defensible, of course.
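
Because Python has no built-in inverse of expandtabs, a round trip is a cheap way to gain some confidence in a hand-written unexpansion: for a single-line string with no tabs in it, expanding the unexpanded result should give back the original. The following is only a sketch under those assumptions. It uses re.sub with a callback instead of the split-and-rejoin style, and the `samples` list is hypothetical.

```python
import re

def unexpand(astring, tablen=8):
    """Replace runs of spaces with tabs wherever a run reaches a tab stop.

    Assumes astring is a single line that contains no tabs, so a
    character's index in the string is also its column.
    """
    def compress(match):
        start, end = match.span()
        runlen = end - start
        numblanks = end % tablen      # spaces needed after the last tab stop
        if numblanks > runlen:        # the run never reaches a tab stop
            return match.group()
        numtabs = (runlen - numblanks + tablen - 1) // tablen
        return '\t' * numtabs + ' ' * numblanks
    return re.sub(r' +', compress, astring)

# Hypothetical tab-free, single-line samples.
samples = [
    'ab' + ' ' * 6 + 'cd',
    'a  b',
    ' ' * 16 + 'x',
    'one two   three     four',
    '',
]

for s in samples:
    for tablen in (4, 8):
        assert unexpand(s, tablen).expandtabs(tablen) == s
```

A round trip only shows that no column shifts; it does not show that the result uses as few characters as possible.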
