This word-wrap function flows paragraphs of text so they fit in a certain column width. It differs from similar methods in that it preserves existing whitespace such as newlines and runs of spaces.

Python, 45 lines
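from functools import reduce  # reduce is not a builtin in Python 3


def wrap(text, width):
    r"""
    A word-wrap function that preserves existing line breaks
    and most spaces in the text. Expects that existing line
    breaks are posix newlines (\n).
    """
    # Fold the space-separated words into one string. Each word is joined
    # with '\n' instead of ' ' when it would overflow the current line.
    return reduce(lambda line, word, width=width: '%s%s%s' %
                  (line,
                   ' \n'[(len(line)-line.rfind('\n')-1
                         + len(word.split('\n',1)[0]
                              ) >= width)],
                   word),
                  text.split(' ')
                 )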

# 2 very long lines separated by a blank line
msg = """Arthur:  "The Lady of the Lake, her arm clad in the purest \
shimmering samite, held aloft Excalibur from the bosom of the water, \
signifying by Divine Providence that I, Arthur, was to carry \
Excalibur. That is why I am your king!"

Dennis:  "Listen. Strange women lying in ponds distributing swords is \
no basis for a system of government. Supreme executive power derives \
from a mandate from the masses, not from some farcical aquatic \
ceremony!\""""

# example: make it fit in 40 columns
print(wrap(msg,40))

# result is below
"""
Arthur:  "The Lady of the Lake, her arm
clad in the purest shimmering samite,
held aloft Excalibur from the bosom of
the water, signifying by Divine
Providence that I, Arthur, was to carry
Excalibur. That is why I am your king!"

Dennis:  "Listen. Strange women lying in
ponds distributing swords is no basis
for a system of government. Supreme
executive power derives from a mandate
from the masses, not from some farcical
aquatic ceremony!"
"""

Most word-wrap functions assume that you have line breaks where you don't want them, so they completely reflow the text. I needed to do some simple word wrapping, and I also wanted to be able to preserve line breaks and runs of spaces that were already present.

This version (v1.5, 2004-10-12), the first update in over two years, incorporates a minor efficiency improvement from Matthias Urlichs.

Known issue: If a line needs to be wrapped in the middle of a run of spaces, there is a chance you will lose a space. This shouldn't matter in most situations, but I thought I'd mention it since I claimed that whitespace is preserved.
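A small illustration of that edge case. This is only a sketch: the function body repeats the recipe's logic so the snippet runs on its own, with an explicit functools import, which Python 3 requires.

```python
from functools import reduce


def wrap(text, width):
    # Same logic as the recipe above, repeated here so this snippet is standalone.
    return reduce(lambda line, word: '%s%s%s' %
                  (line,
                   ' \n'[(len(line) - line.rfind('\n') - 1
                          + len(word.split('\n', 1)[0])) >= width],
                   word),
                  text.split(' '))


original = 'ab    cd'          # four spaces between the two words
wrapped = wrap(original, 4)    # the break lands inside the run of spaces

print(repr(wrapped))           # 'ab  \n cd'
```

Three of the four spaces survive; the fourth is the one that was turned into the line break.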

How does it work? Well, the reduce function takes each word and adds it to the text to be output, preceding the word with either a space or a linefeed, depending on whether the space + the word would make the current line of the output exceed the max width. Each word is found with a simple split on the space character, so a "word" may actually span multiple lines (which is why existing line breaks end up being preserved). In such cases, only the length of the word's first line is used in the overflow determination.
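The same steps can be spelled out as an explicit loop. This is a sketch for illustration only, not part of the recipe; the name wrap_explained and its local variables are hypothetical. It keeps the length of the current output line in a counter instead of recomputing it from the accumulated string, so each word is handled without rescanning the text built so far.

```python
def wrap_explained(text, width):
    """Step-by-step equivalent of the reduce-based wrap (illustrative sketch)."""
    words = text.split(' ')
    pieces = [words[0]]
    # Number of characters after the last newline in the output so far.
    line_len = len(words[0]) - words[0].rfind('\n') - 1
    for word in words[1:]:
        # A "word" may contain newlines; only its first line counts
        # when deciding whether the current line would overflow.
        first_line_len = len(word.split('\n', 1)[0])
        if line_len + first_line_len >= width:
            sep = '\n'      # start a new line before this word
            line_len = 0
        else:
            sep = ' '       # stay on the current line
            line_len += 1
        pieces.append(sep + word)
        if '\n' in word:
            # The word ends on a new line of its own; restart the count there.
            line_len = len(word) - word.rfind('\n') - 1
        else:
            line_len += len(word)
    return ''.join(pieces)


print(wrap_explained('one two\n\nthree four five', 10))
```

In this example the existing blank line between "two" and "three" is kept, and the only break added is the one before "five".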

9 comments

Jørgen Cederberg 13 years, 11 months ago

Nested scopes. Apparently you need to import nested scopes in Python versions before 2.2, i.e. from __future__ import nested_scopes.

Otherwise, this is an excellent function.

Mike Brown (author) 13 years, 10 months ago

fixed. I've edited the recipe, adding "width=width" to the anonymous function's signature. This should work better than importing from __future__, as that would only work in Python 2.1.

Mike Brown (author) 13 years, 7 months ago

Another fix. As noted above, until today, the function had a bug that sometimes caused premature line breaks. My apologies. It's fixed now, at a cost of some speed.

Matthias Urlichs 12 years, 5 months ago

small inefficiency. The code

len(line[line.rfind('\n')+1:])

actually makes a temporary copy of the last line, only to throw it away immediately.

Better:

len(line)-line.rfind('\n')-1

Junyong Pan 11 years, 7 months ago

another CJK supported unicode word-wrap function. http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/358117

Guido van Rossum 9 years, 10 months ago

Don't use this for long inputs. It's code like this that gives reduce() a bad name. :-/ The code is both inscrutable and quadratic.

Thomas Guettler 8 years, 11 months ago

textwrap is in the standard library. Hi,

Python has a textwrap module in the standard library:

http://docs.python.org/lib/module-textwrap.html
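A minimal sketch of how textwrap might be used to get similar behaviour, assuming each existing line is wrapped on its own so that line breaks survive. The helper name wrap_keep_newlines is hypothetical. By default textwrap drops whitespace at the start and end of wrapped lines and breaks words longer than the width, so the result will not always match the recipe's output character for character.

```python
import textwrap


def wrap_keep_newlines(text, width):
    # Wrap every existing line separately; blank lines stay blank.
    return '\n'.join(textwrap.fill(line, width) for line in text.split('\n'))


print(wrap_keep_newlines('aaa bbb ccc\n\nddd', 7))
```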

Dominick Saputo 5 years, 11 months ago

How about something simple, like this. It's not a one-liner, but it's easy to read and understand and it's 6 times more efficient:

def wrap(text, width=80):
    lines = []
    for paragraph in text.split('\n'):
        line = []
        len_line = 0
        for word in paragraph.split(' '):
            len_word = len(word)
            # "not line" avoids emitting an empty line before an over-long first word
            if not line or len_line + len_word <= width:
                line.append(word)
                len_line += len_word + 1
            else:
                lines.append(' '.join(line))
                line = [word]
                len_line = len_word + 1
        lines.append(' '.join(line))
    return '\n'.join(lines)

Mike Brown (author) 5 years, 1 month ago

Thanks, all, for the comments and improvements over the years. If I ever need to reflow a lot of text again, I'll try Dominick Saputo's version or the stdlib textwrap (which my code predates).

Created by Mike Brown on Wed, 4 Sep 2002 (PSF)