This function formats a block of text. The text is broken into tokens. (Whitespace is NOT preserved.) The tokens are reassembled at the specified level of indentation and line width. A string is returned.
def format(text, indent=2, width=70):
    """
    Format a text block.

    This function formats a block of text. The text is broken into
    tokens. (Whitespace is NOT preserved.) The tokens are reassembled
    at the specified level of indentation and line width. A string is
    returned.

    Arguments:
    `text` -- the string to be reformatted.
    `indent` -- the integer number of spaces to indent by.
    `width` -- the maximum width of formatted text (including indent).
    """
    width = width - indent
    out = []
    stack = [word for word in text.replace("\n", " ").split(" ") if word]
    while stack:
        # Always place the first word on the line, even if it is longer
        # than the available width; otherwise an over-long word would
        # never be popped and the loop would never terminate.
        line = stack.pop(0)
        while stack:
            if len(line) + len(" " + stack[0]) > width: break
            line += " " + stack.pop(0)
        out.append(" "*indent + line)
    return "\n".join(out)
Tags: algorithms
Nice, but module 'textwrap' may be a better choice. Module textwrap implements this and more, http://docs.python.org/lib/module-textwrap.html
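For comparison, here is a minimal sketch of the same job done with the standard library. The helper name `fill_indented` is made up for illustration. One difference to be aware of: `textwrap` keeps runs of spaces that fall inside a line, so the sketch collapses whitespace first to mimic the recipe. By default `textwrap` also breaks words that are longer than the line.

```python
import textwrap

def fill_indented(text, indent=2, width=70):
    # fill_indented is a hypothetical helper name, not part of textwrap.
    prefix = " " * indent
    # Collapse all whitespace runs to single spaces, as the recipe does;
    # textwrap on its own would keep multiple spaces inside a line.
    normalized = " ".join(text.split())
    # width counts the indent prefix, matching the recipe's `width` argument.
    return textwrap.fill(normalized, width=width,
                         initial_indent=prefix, subsequent_indent=prefix)

sample = "The quick brown fox\njumps   over the lazy dog."
result = fill_indented(sample, indent=4, width=20)
print(result)
```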
Silly me. textwrap.fill is equivalent to "\n".join(textwrap.wrap(...)), so maybe
A 'split' comment. Isn't
equivalent to
?
Oops. Yes, it is. Maybe I should read the manual, hm?
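Assuming the question is about the recipe's tokenizing expression versus a bare `text.split()` (an assumption, since the comment does not spell out the two expressions), a quick check suggests they agree when the text contains only spaces and newlines, and differ once other whitespace such as tabs appears. The function names below are made up for illustration.

```python
def tokens_recipe(text):
    # The same expression the recipe uses to build its word list.
    return [word for word in text.replace("\n", " ").split(" ") if word]

def tokens_split(text):
    # str.split() with no arguments splits on any run of whitespace
    # and never yields empty strings.
    return text.split()

plain = "one  two\nthree   four\n"   # only spaces and newlines
tabbed = "one\ttwo three"            # contains a tab

same_on_plain = tokens_recipe(plain) == tokens_split(plain)
same_on_tabbed = tokens_recipe(tabbed) == tokens_split(tabbed)
print(same_on_plain, same_on_tabbed)
```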
Yes, and... ...the line "if len(line) + len(" " + stack[0]) > width:" is a little wonky. By adding a " " to stack[0] you're allocating a whole new string and then throwing it away just to take the length. You should have said something like: "if len(line) + 1 + len(stack[0]) > width:" Or better yet, done something like this:
This avoids all the expensive string operations you used to build your lines, and the repeated calls to len(line) in your inner while loop.
Of course, you really should have just used the textwrap module, as Ori says.
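One way to get the effect described in this comment, offered as a sketch rather than the commenter's own code: collect the words of the current line in a list, keep a running length, and join once per line. The name `format_joined` is made up for illustration, and the sketch assumes an over-long word should simply sit on a line by itself.

```python
def format_joined(text, indent=2, width=70):
    # format_joined is a hypothetical name for this sketch.
    avail = width - indent          # room left for text after the indent
    prefix = " " * indent
    out = []
    words = []                      # words collected for the current line
    length = 0                      # length of those words joined by spaces
    for word in text.split():
        # Length the line would have if this word were added to it.
        needed = len(word) if not words else length + 1 + len(word)
        if words and needed > avail:
            # The word does not fit: flush the line and start a new one.
            out.append(prefix + " ".join(words))
            words = [word]
            length = len(word)
        else:
            words.append(word)
            length = needed
    if words:
        out.append(prefix + " ".join(words))
    return "\n".join(out)

sample = "The quick brown fox jumps over the lazy dog."
wrapped = format_joined(sample, indent=4, width=20)
print(wrapped)
```

Each line is built with a single `" ".join(...)`, and the length check is integer arithmetic only, so no throwaway strings are created while deciding whether a word fits.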
Why not use the .rjust function? Just saying.
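If the suggestion is about the indentation step (an assumption, since the comment does not say which line it means), `str.rjust` can produce the same result as `" "*indent + line` by padding the line on the left up to `indent + len(line)` characters. The helper name below is made up for illustration.

```python
def indent_with_rjust(line, indent=2):
    # indent_with_rjust is a hypothetical helper name.
    # rjust pads on the left with spaces up to the requested total width.
    return line.rjust(indent + len(line))

padded = indent_with_rjust("some words", indent=4)
print(repr(padded))
```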