
Other solutions to the entab/detab problem (converting spaces to tabs and back) have been posted, but most of them require lots of looping or many lines of code. Well, I've done up a couple of simple regular expressions that solve it. Enjoy.

Python, 31 lines
import re

def entab(temp, tab_width=4, all=False):

    # If all is true, every run of tab_width consecutive spaces is
    # converted to a tab.  If false, only runs at the beginning of a
    # line are converted.  Default is false.

    if all:
        temp = re.sub(r" {%d}" % tab_width, "\t", temp)
    else:
        patt = re.compile(r"^ {%d}" % tab_width, re.M)
        temp, count = patt.subn("\t", temp)
        i = 1
        while count > 0:
            # This only loops a few times, at most six or seven times on
            # heavily indented code.
            subpatt = re.compile(r"^\t{%d} {%d}" % (i, tab_width), re.M)
            temp, count = subpatt.subn("\t" * (i + 1), temp)
            i += 1
    return temp


def detab(temp, tab_width=4):

    # This code is lazy, but it gets the job done and it's what I use.
    # There is no need for an option to convert only the tabs at the
    # beginning of a line, since if you are taking out tabs it is usually
    # more than just an indentation problem.

    return temp.expandtabs(tab_width)

I wrote this code because in the MacPython project we needed tools to do these types of conversions. The issue is that most Mac text editors do not support using spaces for tabs (or at least not neatly), so tabs are everywhere. But if we get someone else's code and want to edit it, we need to convert all those spaces into tabs so that Python will read the whitespace correctly. This code is actually a modified bit stripped from my command line tool for use on OS X.

I chose regular expressions because they keep the code smaller. In fact, the longest part of this code is the comments that I added in to explain it to you when posting it for the Python Cookbook. A concise solution is always important to me.

The detab code could be written similarly to the entab code, but Python provides a built-in string method, expandtabs, that does it, so I'm lazy and take what they give me. See the comments for more info.
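
For comparison, a regex-based detab in the style of entab might look like the sketch below. The name detab_leading is hypothetical and not part of the recipe. It assumes that indentation consists purely of tabs at the very start of each line, and it expands only those tabs, leaving any tab later in the line untouched.

```python
import re

def detab_leading(text, tab_width=4):
    # Hypothetical helper, not from the recipe: expand only the run of
    # tabs at the start of each line.  Every tab in that run begins on a
    # tab stop, so each one is worth exactly tab_width spaces.
    def expand(match):
        return " " * (tab_width * len(match.group()))
    return re.sub(r"^\t+", expand, text, flags=re.M)
```

Unlike expandtabs, this keeps tabs that appear after the first non-tab character, which may or may not be what you want.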

1 comment

Alex Martelli 22 years, 5 months ago

entab is rather misleading. entab doesn't work on the columnar assumption normally implied by tabs (and respected by detab). It changes runs of spaces to tabs with no care about where the columns end up, which is just about useless (when substituting all tabs).
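
The column issue raised in this comment can be illustrated with a small sketch. The function name entab_columns is hypothetical and appears in neither the recipe nor the comment. It is one possible approach, assuming tab stops every tab_width columns: each line is cut into tab_width-wide cells, and the trailing spaces of a full cell become a single tab only when the cell ends with two or more spaces. It does not try to protect spaces inside string literals.

```python
import re

def entab_columns(text, tab_width=4):
    # Hypothetical helper: convert spaces to tabs while keeping every
    # character in its original column.
    out_lines = []
    for line in text.split("\n"):
        line = line.expandtabs(tab_width)  # normalise existing tabs first
        pieces = []
        for start in range(0, len(line), tab_width):
            cell = line[start:start + tab_width]
            stripped = cell.rstrip(" ")
            # A full cell ending in 2+ spaces runs up to a tab stop, so
            # those spaces can be replaced by one tab.
            if len(cell) == tab_width and len(cell) - len(stripped) >= 2:
                pieces.append(stripped + "\t")
            else:
                pieces.append(cell)
        out_lines.append("".join(pieces))
    return "\n".join(out_lines)

sample = "ab" + " " * 6 + "c"          # "c" sits in column 8
# Same substitution the recipe performs when all is true and tab_width=4:
naive = re.sub(r" {4}", "\t", sample)
print(naive.expandtabs(4) == sample)                  # False
print(entab_columns(sample).expandtabs(4) == sample)  # True
```

The naive substitution turns the sample into "ab", a tab, two spaces and "c", which expands with "c" in column 6 rather than column 8. The column-aware version round-trips through expandtabs unchanged.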