Welcome, guest | Sign In | My Account | Store | Cart

A variation on Recipe 578117 that can deal with cycles. A cycle means that a tree has itself as a subtree somewhere. A fold over such a data structure has a chicken-and-egg-problem: it needs its own result in order to construct its own result. To solve this problem, we let branch construct a part of its result before going into recursion. After the recursion, branch gets a chance to complete its result using its children's results. Python's support for coroutines (using yield) provides a nice way to define such a two-stage branch function.

Python, 28 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
def identity(x):
    return x
    
def cyclist(tree):
    res = []
    reschildren = yield res
    res.extend(reschildren)
    yield

def foldcyc(tree,branch=cyclist,leaf=identity,shared=identity,getchildren=iter):
    mem = dict()
    def _fold(tree):
        if id(tree) in mem:
            return shared(mem[id(tree)])
        else:
            try:
                children = getchildren(tree)
            except:
                res = leaf(tree)
                mem[id(tree)] = res
                return res
            coroutine = branch(tree)
            res = coroutine.next()
            mem[id(tree)] = res
            reschildren = [_fold(child) for child in children]
            coroutine.send(reschildren)
            return res
    return _fold(tree)

Without additional arguments, foldcyc(tree) defaults to reconstructing the structure using lists, while sharing the leaves.

The two-stage approach can be seen in cyclist (no pun intended): before the recursive call into the children of tree, it first constructs an empty list as its own result, for its children to link to. This is passed back to foldsh using yield. After foldsh has finished the recursion, cyclist is re-entered and receives the results, which it uses to complete its own result.

Example
>>> tree = [42,[]]
>>> tree.append(tree)
>>> foldcyc(tree, leaf= lambda x: x+1)
[43, [], <Recursion on list with id=147258796>]
>>> id(_)
147258796
Converting between implicit and explicit representation

We can also represent the cyclic list structure using explicit pointers. A list then contains these references instead of sublists. The administration that maps pointers to lists is kept in a dictionary. The following two functions convert between the two representations:

def egraph(igraph):
    """
    Convert an implicit list graph to an explicit graph: a dict mapping
    node ids to a list of children node ids.
    """
    eg = {}
    ids = itertools.count()
    def branch(tree):
        resid = ids.next()
        reschildren = yield resid
        eg[resid] = reschildren
        yield
    foldcyc(igraph,branch=branch)
    return eg

def igraph(egraph,root):
    """
    Convert a dict, mapping node ids to a list of children node ids,
    into an implicit cyclic structure of lists.
    """
    def getchildren(key):
        return iter(egraph[key])
    return foldcyc(root,getchildren=getchildren)

Usage (using the tree defined above):

>>> egraph(tree)
{0: [42, 1, 0], 1: []}
>>> igraph(_,0)
[42, [], <Recursion on list with id=156110572>]
YAML example

PyYAML can load Python objects from YAML files using !tags, but this doesn't use the class's __init__ function. We define an alternative YAML representation of a Python object: a mapping with class and args keys. We use foldcyc to create (possibly cyclic) instances according to this representation.

import yaml    # from http://pyyaml.org/

def initcyc(tree):
    def branch(tree):
        obj = object.__new__(globals()[tree['class']])
        reschildren = yield obj
        obj.__init__(*reschildren)
        yield
    def getchildren(tree):
        return tree['args']
    return foldcyc(tree,branch,getchildren=getchildren)

class A(object):
    def __init__(self,spam,eggs):
        self.spam = spam
        self.eggs = eggs

class B(object):
    pass

y = yaml.load('''
&a
class: A
args:
- class: B
  args: []
- *a
''')

Let's check:

>>> a = initcyc(y)
>>> a
<tmp.A object at 0xa3baa2c>
>>> a.spam
<tmp.B object at 0xa3ba9ac>
>>> a.eggs
<tmp.A object at 0xa3baa2c>
Created by Sander Evers on Wed, 2 May 2012 (MIT)
Python recipes (4591)
Sander Evers's recipes (6)

Required Modules

  • (none specified)

Other Information and Tasks