This recipe shows a way to generate section numbers for a nested document structure. It can be used within a recursive algorithm to build a table of contents.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 | # SectionPath.py
"""This module provides a simple utility to follow the path to an item in a
recursive tree structure. It can be used to build the section numbers for a
table of contents.
Instances of the SectionPath class are immutable and do not need to be
instantiated directly. Instead, use the "start" instance to begin the
descent into your tree.
Example:
>>> import SectionPath
>>> SectionPath.start
<SectionPath.SectionPath object at 0x4022580c>
>>> print SectionPath.start
1
>>> print SectionPath.start.sibling
2
>>> print SectionPath.start.sibling.child
2.1
>>> print SectionPath.start.sibling.child.child
2.1.1
>>> print SectionPath.start.sibling.child.child.sibling
2.1.2
>>> print SectionPath.start.sibling.child.child.sibling.sibling
2.1.3
See the bottom of this file for a real example of this module's use.
"""
class SectionPath(object):
"""Represents a path through the sections of a document tree."""
def __init__(self, init_path):
self.path = init_path
def child(self):
return SectionPath((1, self.path))
child = property(child)
def sibling(self):
head, tail = self.path
return SectionPath((head + 1, tail))
sibling = property(sibling)
def __str__(self):
result = []
path = self.path
while path is not None:
head, tail = path
result.append(str(head))
path = tail
result.reverse()
return '.'.join(result)
# Singleton instance. Use this to start your descent.
start = SectionPath((1, None))
if __name__ == '__main__':
import SectionPath
# Create a recursive tree structure. Each node in the tree is represented
# by a (string, node list) tuple. In a more realistic scenario, objects
# representing the content of the section could be stored instead of just
# titles.
tree = [
("About me", [
("My life", []),
("My cats", []),
("My lanta", []),
]),
("Portfolio", [
("Art", []),
("Music", [
("Vocal", []),
("Instrumental", []),
]),
("Poetry", [
("Bad", [])
]),
]),
("Conclusion", [
("Recursion is hard", []),
("Python makes it easier", []),
]),
# etc.
]
def print_toc(tree):
def recurse(sections, path):
for section in sections:
# Unpack each tree node as (string, list) tuple.
title, children = section
# Print the current section number and its title.
# Converting "path" to a string yields the section number.
print '%s. %s' % (path, title)
# Call this function recursively for each child, passing the
# "child" property of the path.
recurse(children, path.child)
# Advance the path to the next sibling.
path = path.sibling
# Call the recursive function, using the "start" path to begin
# generating section numbers starting at "1".
recurse(tree, SectionPath.start)
# Pass the recursive tree structure into the table of contents printer.
print_toc(tree)
# Expected output:
#
# 1. About me
# 1.1. My life
# 1.2. My cats
# 1.3. My lanta
# 2. Portfolio
# 2.1. Art
# 2.2. Music
# 2.2.1. Vocal
# 2.2.2. Instrumental
# 2.3. Poetry
# 2.3.1. Bad
# 3. Conclusion
# 3.1. Recursion is hard
# 3.2. Python makes it easier
|
Recursive algorithms are sometimes hard to write, and it often helps to use good abstractions because you don't have to keep as much information in your head at once. This recipe shows how the abstraction of one concept, section numbering, allows you to minimize its impact on the implementation of a larger concept, building a table of contents.
The section path is the path one must take to get to a particular item in the tree. For instance, a section number of "1.2.4" actually implies a sequence of steps: find section 1, find subsection 2, find sub-subsection 4. Internally, paths are stored as a linked list structure built out of tuples. In this case, the path would be (4, (2, (1, None))). Each time a child element (a subsection) is visited, a new tuple is created that wraps around the existing path: (1, (4, (2, (1, None)))). When a sibling (the next item at the current level) is visited, the number at the head of the list is incremented: (2, (4, (2, (1, None)))). Once a string representation of the path is needed, the linked list is converted to a list of strings, reversed, and joined with a '.', forming, in the above example, 1.2.4.2.
This recipe takes advantage of several powerful features of the Python language. By using Python 2.3's property() builtin, I was able to reduce the syntax to simple dot-notation. Manipulation of the linked list and section tree was facilitated by "tuple unpacking", which allows you to easily extract multiple values from a tuple or other sequence. Nested functions allowed me to hide the recursive helper function, recurse(), from the user of print_toc(). Finally, the use of Python's built-in list and tuple data types allowed me to represent the table of contents tree directly in Python with a minimal amount of fuss.