
This function formats a block of text. The text is broken into tokens. (Whitespace is NOT preserved.) The tokens are reassembled at the specified level of indentation and line width. A string is returned.

Python, 25 lines
def format(text, indent=2, width=70):
    """
    Format a text block.
    
    This function formats a block of text. The text is broken into
    tokens. (Whitespace is NOT preserved.) The tokens are reassembled
    at the specified level of indentation and line width.  A string is
    returned.

    Arguments:
        `text`   -- the string to be reformatted.
        `indent` -- the integer number of spaces to indent by.
        `width`  -- the maximum width of formatted text (including indent).
    """
    width = width - indent
    out = []
    stack = [word for word in text.replace("\n", " ").split(" ") if word]
    while stack:
        # Start each line with one word so an overlong word cannot stall the loop.
        line = stack.pop(0)
        while stack:
            if len(line) + len(" " + stack[0]) > width: break
            line += " " + stack.pop(0)
        out.append(" "*indent + line)
    return "\n".join(out)

6 comments

Ori Peleg 17 years, 7 months ago

Nice, but the standard-library module 'textwrap' may be a better choice. It implements this and more: http://docs.python.org/lib/module-textwrap.html

import textwrap
def format(text, indent=2, width=70):
    return "\n".join(textwrap.wrap(text, width=width, initial_indent=" "*indent, subsequent_indent=" "*indent))

Ori Peleg 17 years, 7 months ago

Silly me. textwrap.fill(...) is equivalent to "\n".join(textwrap.wrap(...)), so maybe

import textwrap
def format(text, indent=2, width=70):
    return textwrap.fill(text, width=width, initial_indent=" "*indent, subsequent_indent=" "*indent)
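
A quick check of that equivalence, as a minimal sketch. The helper name fill_block and the sample text are made up for illustration; only textwrap.fill and textwrap.wrap come from the standard library.

```python
import textwrap

def fill_block(text, indent=2, width=70):
    # Hypothetical helper name; same call as the wrapper in the comment above.
    pad = " " * indent
    return textwrap.fill(text, width=width, initial_indent=pad, subsequent_indent=pad)

sample = "The quick brown fox\njumps over the lazy dog."

filled = fill_block(sample, indent=4, width=24)
wrapped = textwrap.wrap(sample, width=24, initial_indent="    ", subsequent_indent="    ")

# fill() returns one string; wrap() returns the same lines as a list.
print(filled)
print(filled == "\n".join(wrapped))
```

Note that textwrap counts the indent as part of width, which matches the recipe's "including indent" convention.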

Ori Peleg 17 years, 7 months ago

A 'split' comment. Isn't

stack = text.split()

equivalent to

stack = [word for word in text.replace("\n", " ").split(" ") if word]

?

Alexander Ross (author) 17 years, 7 months ago

Oops. Yes, it is. Maybe I should read the manual, hm?
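
For text that contains only spaces and newlines the two forms do agree. They can differ once other whitespace such as tabs is involved, because split() with no argument splits on any whitespace. A small sketch (the name split_original and the sample strings are illustrative only):

```python
def split_original(text):
    # The tokenizer from the recipe: only "\n" and " " act as separators.
    return [word for word in text.replace("\n", " ").split(" ") if word]

plain = "one  two\nthree \n four"
tabbed = "one\ttwo three"

# Same result when only spaces and newlines are present.
print(split_original(plain) == plain.split())

# A tab is not a separator for the original, but it is for split().
print(split_original(tabbed))
print(tabbed.split())
```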

Simon Forman 17 years, 7 months ago

Yes, and the line "if len(line) + len(" " + stack[0]) > width:" is a little wonky. By adding a " " to stack[0] you're allocating a whole new string and then throwing it away just to take its length. You could have written "if len(line) + 1 + len(stack[0]) > width:" instead. Or better yet, done something like this:

def format(text, indent=2, width=70):
    width = width - indent
    out = []

    # Make a generator to yield words and lengths.
    # Add 1 to the lengths to account for spaces.
    gen = ((len(word) + 1, word) for word in text.split())

    line = []
    line_len = -1 # adjust for space for 1st word.

    for wlength, word in gen:

        # Add word length (plus space length) to line length.
        line_len += wlength

        # Check if we've filled a line (an empty line always accepts a word).
        if line and line_len > width:

            # Build and append one line.
            out.append(" " * indent + " ".join(line))

            # Set line and length to word and length.
            line = [word]
            line_len = wlength - 1

        # If not, keep adding words to the line list.
        else:
            line.append(word)

    # Flush the last, partially filled line.
    if line:
        out.append(" " * indent + " ".join(line))

    return "\n".join(out)

This avoids all the expensive string operations you used to build your lines, and the repeated calls to len(line) in your inner while loop.

Of course, you really should have just used the textwrap module, as Ori says.
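
One way to sanity-check a hand-rolled wrapper is to compare it with textwrap.fill on simple input. The sketch below is an illustration, not the code from this thread: greedy_wrap is a hypothetical name, and the comparison is only expected to hold for single-spaced text without hyphens or words longer than the line.

```python
import textwrap

def greedy_wrap(text, indent=2, width=70):
    # Hypothetical helper: greedy word wrap that tracks the line length
    # as an integer instead of rebuilding the line string each time.
    room = width - indent
    lines, line, line_len = [], [], -1  # -1 cancels the first word's space
    for word in text.split():
        if line and line_len + 1 + len(word) > room:
            lines.append(" " * indent + " ".join(line))
            line, line_len = [], -1
        line.append(word)
        line_len += 1 + len(word)
    if line:
        # Do not forget the last, partially filled line.
        lines.append(" " * indent + " ".join(line))
    return "\n".join(lines)

sample = "alpha beta gamma delta epsilon zeta eta theta"
ours = greedy_wrap(sample, indent=2, width=16)
reference = textwrap.fill(sample, width=16, initial_indent="  ", subsequent_indent="  ")

print(ours)
print(ours == reference)
```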

Glen 12 years, 10 months ago

Why not use the .rjust method? Just saying.
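
It is not clear which use of .rjust is meant. One guess (an assumption, not something stated in the comment) is that it could replace the " " * indent + line step. A minimal sketch of that reading, which also shows that rjust only pads and does not wrap:

```python
indent = 2
line = "wrapped text"

# rjust pads on the left up to a total width; it does not wrap anything.
padded = line.rjust(len(line) + indent)
prefixed = " " * indent + line

print(repr(padded))
print(padded == prefixed)

# A target width shorter than the string leaves it unchanged.
print(repr("abc".rjust(2)))
```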

Created by Alexander Ross on Thu, 31 Aug 2006 (PSF)