
Parseline breaks a line (actually a string) into Python objects such as strings, floats, and ints, based on a short format string.

Python, 35 lines
def parseline(line,format):
    """\
    Given a line (a string actually) and a short string telling
    how to format it, return the Python objects that result: a list
    when several words are converted, the bare object when only one
    is, and None when none are.

    The format string maps words (as split by line.split()) into
    python code:
    x   ->    Nothing; skip this word
    s   ->    Return this word as a string
    i   ->    Return this word as an int
    d   ->    Return this word as an int
    f   ->    Return this word as a float

    Basic parsing of strings:
    >>> parseline('Hello, World','ss')
    ['Hello,', 'World']

    You can use 'x' to skip a record; you also don't have to parse
    every record:
    >>> parseline('1 2 3 4','xdd')
    [2, 3]

    >>> parseline('C1   0.0  0.0 0.0','sfff')
    ['C1', 0.0, 0.0, 0.0]
    """
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    words = line.split()
    for i, f in enumerate(format):
        trans = xlat.get(f)
        if trans:
            result.append(trans(words[i]))
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result

I have to parse the output of simulation codes all the time, and this little recipe gets imported or pasted into most of my code at one time or another. The idea is to grab a line and interpret its records according to a little format string, whose characters are applied one by one to the words returned by line.split().
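
As a rough illustration of that workflow, the sketch below applies the recipe to a few lines of made-up simulation output. The sample text and the read_atoms helper are hypothetical, invented only for this example, and parseline is restated compactly so the block runs on its own.

```python
def parseline(line, format):
    # Compact restatement of the recipe above, minus the docstring.
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    words = line.split()
    result = []
    for i, f in enumerate(format):
        trans = xlat.get(f)
        if trans:
            result.append(trans(words[i]))
    if len(result) == 0:
        return None
    if len(result) == 1:
        return result[0]
    return result


# Hypothetical output (an assumption for this example):
# atom index, label, then x, y, z coordinates.
SAMPLE_OUTPUT = """\
1 C1   0.000  0.000  0.000
2 H1   0.000  0.000  1.089
3 H2   1.027  0.000 -0.363
"""


def read_atoms(text):
    """Return [label, x, y, z] for each non-blank line, skipping the index."""
    return [parseline(line, 'xsfff')
            for line in text.splitlines() if line.strip()]


atoms = read_atoms(SAMPLE_OUTPUT)
for atom in atoms:
    print(atom)
```

The 'x' at the front of the format drops the atom index, so each entry starts with the label followed by three floats.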

I posted this recipe to c.l.python, and Fredrik Lundh posted a really great little variation:

def parseline(line, *types): return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

IMHO it loses some of the terseness of mine in application (it's amazing how many text records a line can have, and typing parseline(line,None,None,None,None,None,None,float) is much more tiring than parseline(line,'xxxxxxf')). But he wins points for pythonicity.
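
One possible way to keep both the short format string and the compact comprehension is to translate the format characters into converters before calling the variadic version. This is only a sketch: parseline_types and parseline_fmt are names invented here, the sample line is made up, and the character table is copied from the recipe above. Note that the variadic form always returns a list ([None] when nothing is selected), and because zip stops at the shorter input, a line with fewer words than format characters is silently truncated rather than raising an IndexError.

```python
# Converter table copied from the original recipe; None means "skip".
XLAT = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}


def parseline_types(line, *types):
    # Fredrik Lundh's variation, renamed here to avoid clashing with the recipe.
    return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]


def parseline_fmt(line, fmt):
    # Hypothetical adapter: expand each format character into its converter.
    # Unknown characters map to None, so they are skipped just like 'x'.
    return parseline_types(line, *[XLAT.get(ch) for ch in fmt])


# Made-up log line: skip seven words, convert the eighth to a float.
energy = parseline_fmt('Total SCF energy after 12 cycles = -76.0267', 'xxxxxxxf')
print(energy)
```

Unlike the original recipe, this variant never unwraps a single result, so callers always index into a list.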

Created by Rick Muller on Mon, 26 Feb 2007 (PSF)