Splits a string multiple times, hierarchically.
def nestedSplit(astring, sep=None, *subsep):
    r"""nestedSplit(astring, sep=None, *subsep): given astring and one or more
    separator strings, split astring hierarchically. The first separator is
    the highest-level one.
    Ex.: nestedSplit("a b\nc d", "\n", " ") => [['a', 'b'], ['c', 'd']]"""
    if subsep:
        # Split on the current separator, then recurse with the remaining ones.
        return [nestedSplit(fragment, *subsep) for fragment in astring.split(sep)]
    # Last separator: a plain split produces the leaves.
    return astring.split(sep)


if __name__ == '__main__':
    from textwrap import dedent

    st = "a b\nc d"
    print(st)
    print(nestedSplit(st, "\n", " "))
    print()

    # Two blocks separated by a blank line, so "\n\n" splits the blocks
    # and "\n" splits the rows inside each block.
    tetris = """\
        ....
        .##.
        .##.
        ....

        ####
        ####
        ..##
        ..##"""
    tetris = dedent(tetris)
    print(tetris)
    print(nestedSplit(tetris, "\n\n", "\n"))
This is just a very basic implementation; it can be extended or improved in many ways:
- removing the recursion to speed it up a little;
- removing empty lines, filtering out unwanted things, etc;
- mapping a given function on the leaves of this tree of lists;
- adding another feature: alternative separators, where a sequence of alternative separator strings can be given in place of a single string, like:
  s = "a b\nc,d"
  multiSplit(s, "\n", [" ", ","]) ==> [['a', 'b'], ['c', 'd']]
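Two of the ideas above (alternative separators and mapping a function over the leaves) can be sketched as follows. This is only an illustration, not part of the recipe: the names multi_split and map_leaves are made up here, the code assumes Python 3, and the use of re.split for the alternatives is one possible choice among several.

```python
import re


def multi_split(text, *seps):
    """Hypothetical variant of nestedSplit: each level is either a single
    separator string, None (whitespace), or a sequence of alternatives."""
    if not seps:
        return text.split()
    first, rest = seps[0], seps[1:]
    if first is None or isinstance(first, str):
        parts = text.split(first)
    else:
        # Try longer alternatives first so that a short separator does not
        # shadow a longer one that starts with the same characters.
        alternatives = sorted(first, key=len, reverse=True)
        pattern = "|".join(re.escape(alt) for alt in alternatives)
        parts = re.split(pattern, text)
    if rest:
        return [multi_split(part, *rest) for part in parts]
    return parts


def map_leaves(func, tree):
    """Apply func to every leaf (non-list item) of a nested list."""
    if isinstance(tree, list):
        return [map_leaves(func, item) for item in tree]
    return func(tree)


s = "a b\nc,d"
tree = multi_split(s, "\n", [" ", ","])
print(tree)                         # [['a', 'b'], ['c', 'd']]
print(map_leaves(str.upper, tree))  # [['A', 'B'], ['C', 'D']]
```

Empty fragments could be filtered in the same spirit, for example by dropping empty strings from parts before recursing.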
V. 1.1: fixed a silly bug (originally the parameters were inverted); thank you, Shane Holloway, next time I'll test things more.
V. 1.2: changed the description to use a word that exists in the English language.
V. 1.3: changed the name and improved the function on the basis of Erik Wilsher's code (but nestedSplit doesn't contain the not).
split[1:] vs split[-1:].
Notice the splits[1:] instead of splits[-1:] -- in case you want to split on more than three parameters.
Now you can correctly split::
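The slicing difference can be illustrated with a small sketch. It assumes a hypothetical variant that receives all separators as one sequence named splits (outermost first); the function names below are invented for the illustration and this is not the commenter's code. It assumes Python 3.

```python
def nested_split_seq(text, splits):
    """Hypothetical variant: splits is a sequence of separators, outermost first."""
    if len(splits) > 1:
        # splits[1:] hands ALL remaining separators to the next level.
        return [nested_split_seq(part, splits[1:])
                for part in text.split(splits[0])]
    return text.split(splits[0])


def nested_split_last_only(text, splits):
    """Same idea, but with splits[-1:], which keeps only the LAST separator."""
    if len(splits) > 1:
        # Every separator between the first and the last is silently skipped.
        return [nested_split_last_only(part, splits[-1:])
                for part in text.split(splits[0])]
    return text.split(splits[0])


text = "a b,c d;e f,g h"
seps = [";", ",", " "]

print(nested_split_seq(text, seps))
# [[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]

print(nested_split_last_only(text, seps))
# [['a', 'b,c', 'd'], ['e', 'f,g', 'h']]
```

With only two separators the two slices give the same result, which is why the problem shows up only when splitting on more levels.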
gerarchically? Took me a moment, but I think the term you mean in the description is "hierarchically", not "gerarchically". The latter isn't a word in the English language.
It's not a coding issue, but fixing it might help people understand and find the recipe...
You could rearrange the argument list to have a mandatory first separator. If you have a mandatory first separator argument, you can better emulate the behaviour of string.split. In addition, you avoid slicing the separator argument.
Mistake in comment above. Slight mistake in the comment above. The first separator arg is not mandatory, but has a default value.
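A short sketch of what the default value buys, using the recipe's nestedSplit signature (repeated here so the snippet runs on its own, assuming Python 3): with no separator, or with None at any level, the split falls back to the whitespace behaviour of str.split.

```python
def nestedSplit(astring, sep=None, *subsep):
    # Same logic as the recipe: sep defaults to None, like str.split.
    if subsep:
        return [nestedSplit(fragment, *subsep) for fragment in astring.split(sep)]
    return astring.split(sep)


# No separator at all: behaves like astring.split().
print(nestedSplit("  a  b \n c "))             # ['a', 'b', 'c']

# None as an inner separator: whitespace split inside each line.
print(nestedSplit("a b\nc  d", "\n", None))    # [['a', 'b'], ['c', 'd']]
```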
Thank you. Your solution is much better than mine. I don't know if this deserves to become a standard string method. It's slick and elegant, and I use it now and then, but I don't know how often other people would use something like this.