parseline breaks a line (really just a string) into Python objects such as strings, floats, and ints, based on a short format string.
def parseline(line, format):
    """\
    Given a line (a string actually) and a short string telling
    how to format it, return the python objects that result:
    a list if several words are converted, the bare object if only
    one is, and None if none are.

    The format string maps words (as split by line.split()) into
    python code:
    x   ->    Nothing; skip this word
    s   ->    Return this word as a string
    i   ->    Return this word as an int
    d   ->    Return this word as an int
    f   ->    Return this word as a float

    Basic parsing of strings:
    >>> parseline('Hello, World','ss')
    ['Hello,', 'World']

    You can use 'x' to skip a record; you also don't have to parse
    every record:
    >>> parseline('1 2 3 4','xdd')
    [2, 3]

    >>> parseline('C1 0.0 0.0 0.0','sfff')
    ['C1', 0.0, 0.0, 0.0]
    """
    # Map each format character to its converter; None means "skip".
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    words = line.split()
    for i in range(len(format)):
        f = format[i]
        trans = xlat.get(f)
        if trans: result.append(trans(words[i]))
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result
I have to parse the output of simulation codes all the time, and this little recipe gets imported or pasted into most of my code at one time or another. The idea is to grab a line and interpret its records according to a little format string, whose characters are applied one by one to the words returned by line.split().
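As a rough illustration of that workflow, here is a minimal sketch that pulls a step number and an energy out of each line of some program output. The sample text, the SAMPLE_OUTPUT name, and the energies dictionary are all made up for this example and do not come from any real simulation code; parseline is repeated in compact form only so the snippet runs on its own.

```python
def parseline(line, format):
    """Compact copy of the recipe, repeated so this sketch is self-contained."""
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    words = line.split()
    # Convert only the words whose format character has a converter.
    result = [xlat[f](words[i]) for i, f in enumerate(format) if xlat.get(f)]
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result

# Hypothetical output, invented for illustration only.
SAMPLE_OUTPUT = """\
Step 1 Energy = -76.0235 Hartree
Step 2 Energy = -76.0261 Hartree
Step 3 Energy = -76.0268 Hartree
"""

energies = {}
for line in SAMPLE_OUTPUT.splitlines():
    # Words: 'Step', '1', 'Energy', '=', '-76.0235', 'Hartree'
    # 'xdxxf' skips 'Step', reads an int, skips two words, reads a float;
    # the trailing 'Hartree' is ignored because the format stops early.
    step, energy = parseline(line, 'xdxxf')
    energies[step] = energy

print(energies)
```

Because the format string may be shorter than the line, trailing words you don't care about never need to be mentioned.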
I posted this recipe to c.l.python, and Fredrik Lundh posted a really great little variation:
def parseline(line, *types): return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]
IMHO it loses some of the terseness of mine in application (it's amazing how many text records a line can have, and typing parseline(line,None,None,None,None,None,None,float) is much more tiring than parseline(line,'xxxxxxf')). But he wins points for pythonicity.
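Beyond the typing, the two versions don't behave identically at the edges, which is worth knowing before swapping one for the other. The sketch below puts them side by side; the names parseline_fmt and parseline_types are invented here purely so both can live in one file, and the bodies are the two versions discussed above.

```python
def parseline_fmt(line, format):
    """Format-string version (compact copy of the recipe)."""
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    words = line.split()
    result = [xlat[f](words[i]) for i, f in enumerate(format) if xlat.get(f)]
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result

def parseline_types(line, *types):
    """Variation that takes converter callables instead of a format string."""
    return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

line = 'C1 0.0 1.5 2.5'

# One converted word: the format version unwraps it, the other keeps a list.
single_fmt = parseline_fmt(line, 'xxxf')
single_types = parseline_types(line, None, None, None, float)

# Nothing converted: None versus [None].
empty_fmt = parseline_fmt(line, 'xx')
empty_types = parseline_types(line, None, None)

# More format entries than words: the format version indexes past the end
# of the word list, while zip() quietly stops at the shorter sequence.
try:
    short_fmt = parseline_fmt('1 2', 'ddd')
except IndexError:
    short_fmt = 'IndexError'
short_types = parseline_types('1 2', int, int, int)

print(single_fmt, single_types)
print(empty_fmt, empty_types)
print(short_fmt, short_types)
```

Which behavior is preferable depends on the caller: the unwrapped scalar is handy for one-off grabs, while the always-a-list result is easier to handle uniformly.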