
This module takes a list of equal-length lists and converts it into XML.

If the first sublist is a list of headings, these are used as the element names for the rest of the data; alternatively, the headings can be passed in the function call. The root and "row" elements can also be named if required.

Python, 160 lines
#LL2XML.py
"""
See http://www.outwardlynormal.com/python/ll2XML.htm for full documentation.

This module converts a list of lists into XML
(e.g. a parsed comma separated values file or whatever).
With the proper arguments, the XML output will be an HTML table.
(See the test function for an example.)

If you want to use a csv as input, you will first need to get
hold of a csv parser to create the list of lists.
Examples include those at:
http://tratt.net/laurie/python/asv/
and
http://www.object-craft.com.au/projects/csv/
"""

# set up exceptions
class Error(Exception):
    def __init__(self, errcode,  heading_num = 0, sublist_length = 0):
        self.errcode = errcode
        if self.errcode == "Length Error - Sublists":
            self.message = ["All the sublists must be of uniform length."]
        elif self.errcode == "Heading Error - Empty Item":
            self.message = ["There is at least one empty heading item.\n",
                       "Please supply only non-empty headings."]
        elif self.errcode == "Heading Error - heading/sublist missmatch":
            self.message = ["Number of headings=", repr(heading_num), "\n",
                          "Number of elements in sublists=", repr(sublist_length), "\n",
                          "These numbers must be equal."]
        else: self.message = ""
        self.errmsg = "".join(self.message)
        
    def __str__(self):
        return self.errmsg

def escape(s):
    """Replace special characters '&', "'", '<', '>' and '"' by XML entities."""
    s = s.replace("&", "&amp;") # Must be done first!
    s = s.replace("'", "&apos;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    s = s.replace('"', "&quot;")
    return s

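def cleanString(s, ident):
    """Escape s for XML; if ident is "tag", also lower-case it and replace spaces with underscores."""
    if not isinstance(s, str):
        s = repr(s)     # non-string values are converted with repr()
    s = escape(s)
    if ident == "tag":
        s = s.lower()
        s = s.replace(" ", "_")
    return s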

def LL2XML(LL,headings_tuple = (), root_element = "rows", row_element = "row", xml_declared = "yes"):
    if headings_tuple == "table":
        headings_tuple = ("td",) * len(LL[0])    # one "td" tag per column
        root_element = "table"
        row_element = "tr"
        xml_declared = "no"
        
    root_element = cleanString(root_element, "tag")
    row_element = cleanString(row_element, "tag")
    if headings_tuple  == (): 
        headings = [cleanString(s,"tag") for s in LL[0]]
        LL = LL[1:]         # remove now redundant heading row
    else:
        headings = [cleanString(s,"tag") for s in headings_tuple]
        
    # Sublists all of the same length?
    if any(len(sublist) != len(LL[0]) for sublist in LL):
        raise Error("Length Error - Sublists")
        
    #check headings
    heading_num = len(headings)
    if heading_num != len(LL[0]):
        raise Error("Heading Error - heading/sublist missmatch", heading_num, len(LL[0]))
    
    for item in headings:
        if not cleanString(item,"heading"):
            raise Error("Heading Error - Empty Item")
    
    # Do the conversion
    xml = ""
    if xml_declared == "yes":
        xml_declaration = '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    else:
        xml_declaration = ""
    bits = []
    add_bit = bits.append
    add_bit(xml_declaration)
    add_bit('<')
    add_bit(root_element)
    add_bit('>')
    for sublist in LL:
        add_bit("\n  <")
        add_bit(row_element)
        add_bit(">\n")
        # headings and sublist have equal length (checked above), so pair them up
        for tag, item in zip(headings, sublist):
            item = cleanString(item, "item")
            add_bit("    <")
            add_bit(tag)
            add_bit(">")
            add_bit(item)
            add_bit("</")
            add_bit(tag)
            add_bit(">\n")
        add_bit("  </")
        add_bit(row_element)
        add_bit(">")
    add_bit("\n</")
    add_bit(root_element)
    add_bit(">")
    xml = "".join(bits)
    return xml

def test():
    LL = [['Login', 'First Name', 'Last Name', 'Job', 'Group', 'Office', 'Permission'],
           ['auser', 'Arnold', 'Atkins', 'Partner', 'Tax', 'London', 'read'],
           ['buser', 'Bill', 'Brown', 'Partner', 'Tax', 'New York', 'read'],
           ['cuser', 'Clive', 'Cutler', 'Partner', 'Management', 'Brussels', 'read'],
           ['duser', 'Denis', 'Davis', 'Developer', 'ISS', 'London', 'admin'],
           ['euser', 'Eric', 'Ericsson', 'Analyst', 'Analysis', 'London', 'admin'],
           ['fuser', 'Fabian', 'Fowles', 'Partner', 'IP', 'London', 'read']]
        
    LL_no_heads = [['auser', 'Arnold', 'Atkins', 'Partner', 'Tax', 'London', 'read'],
                    ['buser', 'Bill', 'Brown', 'Partner', 'Tax', 'New York', 'read'],
                    ['cuser', 'Clive', 'Cutler', 'Partner', 'Management', 'Brussels', 'read'],
                    ['duser', 'Denis', 'Davis', 'Developer', 'ISS', 'London', 'admin'],
                    ['euser', 'Eric', 'Ericsson', 'Analyst', 'Analysis', 'London', 'admin'],
                    ['fuser', 'Fabian', 'Fowles', 'Partner', 'IP', 'London', 'read']]

    #Example 1
    print("Example 1: Simple case, using defaults.\n")
    print(LL2XML(LL))
    print("\n")
        
    #Example 2
    print("""Example 2: LL has its headings in the first line, and we define our root and row element names.\n""")
    print(LL2XML(LL,(),"people","person"))
    print("\n")
    
    #Example 3
    print("""Example 3: headings supplied using the headings argument (tuple), using default root and row element names.\n""")
    print(LL2XML(LL_no_heads,("Login","First Name","Last Name","Job","Group","Office","Permission")))
    print("\n")
    
    #Example 4
    print("""Example 4: The special case where we ask for an HTML table as output by just giving the string "table" as the second argument.\n""")
    print(LL2XML(LL,"table"))

Parsers of tabular data or comma separated values (csv) files will usually output a list of lists. Converting these to XML allows them to be manipulated with XSLT and other XML tools.
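
As a rough illustration of that pipeline, here is a minimal Python 3 sketch. It does not use LL2XML itself: the standard library's csv module produces the list of lists, and xml.etree.ElementTree stands in for the conversion step. The sample data and the helper name rows_to_element are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; any CSV parser that yields a list of lists would do.
CSV_TEXT = "Login,Office\nauser,London\nbuser,New York\n"

def rows_to_element(rows, root_name="rows", row_name="row"):
    """Build an element tree from a list of lists whose first row holds headings."""
    # Same tag clean-up the recipe applies: lower-case, spaces to underscores.
    tags = [h.lower().replace(" ", "_") for h in rows[0]]
    root = ET.Element(root_name)
    for values in rows[1:]:
        row = ET.SubElement(root, row_name)
        for tag, value in zip(tags, values):
            ET.SubElement(row, tag).text = value  # escaped when serialised
    return root

rows = list(csv.reader(io.StringIO(CSV_TEXT)))  # the list of lists
root = rows_to_element(rows, "people", "person")
xml_text = ET.tostring(root, encoding="unicode")

# Once the data is XML, standard XML tools can query it, e.g. a path lookup.
offices = [e.text for e in ET.fromstring(xml_text).findall("./person/office")]
print(offices)
```

The same XML text could equally be handed to an XSLT processor; the path lookup is just the shortest way to show the data being consumed by an XML tool.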

9 comments

Julius Welby (author) 22 years, 8 months ago

Be sure to use the source from the text source link above. The source code contains entities which are interpreted by web browsers, so these do not display correctly on this page.

The text source is fine.

Julius Welby (author) 22 years, 8 months ago

Escaped commas. I've just found out that the way commas are escaped in csv is to put the whole data element in quotes, not just the comma. (A much less sensible idea, IMHO). This means that the posted version will not work where one or more commas are present in the csv values.

Sorry about that. If anyone wants to post a fix, I'd be grateful. Otherwise, I'll fix and update this very shortly.

Hamish Lawson 22 years, 8 months ago

Use existing module to parse CSV? For the job of parsing CSV you may want to use one of the existing modules available at the Vaults of Parnassus:

http://www.vex.net/parnassus/apyllo.py?find=csv

They will probably already have solved the problem of dealing properly with commas and quotes.
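
In current Python the standard library ships a csv module that handles this. A small sketch (the input string is invented for the example) showing that a field quoted because it contains a comma comes back as a single list item:

```python
import csv
import io

# Hypothetical input: the second field of the data row contains a comma,
# so the whole field is wrapped in double quotes.
CSV_TEXT = 'Login,Office\nauser,"London, UK"\n'

rows = list(csv.reader(io.StringIO(CSV_TEXT)))
print(rows)
```

The result is a list of equal-length lists, which is the shape this recipe expects as input.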

Julius Welby (author) 22 years, 8 months ago

I should have looked there first, shouldn't I? Yes, you are quite right. Well that's saved me a bit of work!

I'll simplify the script to just accept the pre-parsed data and repost the module shortly. Thanks!

Julius Welby (author) 22 years, 8 months ago

csv to XML? Pah! I convert a list of lists! OK, I've slashed and burned and rejigged, and now the module seems to work fine. Send it a well formed list of lists (sublists all of equal length) and it will spit XML at you.

As before, use the plain text link, as the script as it appears on this page is broken.

Now I suppose it will be rejected for being too long.

Ah well.

Julius Welby (author) 22 years, 8 months ago

Tidied it up. OK, I've stripped out the documentation (it's on my web site), used the Exception class to create my exceptions, and used the append method throughout, rather than use string manipulation (it's quicker).
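
For readers unfamiliar with the idiom, here is a small sketch of the two approaches side by side. The function names are invented for the example, and it makes no timing claim: whether the list version is actually faster depends on the interpreter and the amount of data, so measure with timeit if it matters.

```python
def build_with_join(items):
    """Collect fragments in a list and join once at the end."""
    bits = []
    add_bit = bits.append  # bind the method once, as the recipe does
    for item in items:
        add_bit("<td>")
        add_bit(item)
        add_bit("</td>")
    return "".join(bits)

def build_with_concat(items):
    """Grow the result by repeated string concatenation."""
    out = ""
    for item in items:
        out = out + "<td>" + item + "</td>"
    return out

cells = ["a", "b", "c"]
joined = build_with_join(cells)
print(joined == build_with_concat(cells))  # both build the same string
```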

One nice additional feature: If you do LL2XML.LL2XML(LL,"table") - where LL is a list of equal length lists - the script returns an HTML table.

(As before, use the text source, or download from my site.)
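
To show what the "table" mode produces without importing the module, here is a stand-alone Python 3 sketch that mirrors the recipe's layout (root element table, row element tr, every cell a td, no XML declaration). The name rows_to_table is hypothetical, and it uses xml.sax.saxutils.escape, which by default escapes only &, < and >, whereas the recipe's own escape() also replaces quotes.

```python
from xml.sax.saxutils import escape

def rows_to_table(rows):
    """Render a list of equal-length lists as an indented HTML table."""
    bits = ["<table>"]
    for row in rows:
        bits.append("\n  <tr>\n")
        for cell in row:
            bits.append("    <td>%s</td>\n" % escape(str(cell)))
        bits.append("  </tr>")
    bits.append("\n</table>")
    return "".join(bits)

table = rows_to_table([["Login", "Office"], ["auser", "London"]])
print(table)
```

Note that in this mode the first sublist is treated as an ordinary row, so a heading row ends up as td cells rather than th cells.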

Julius Welby (author) 21 years, 8 months ago

Added a few minor fixes. See http://www.outwardlynormal.com/python/ll2XML.htm for details.

Frank Jacques 21 years, 8 months ago

Just a minor thing. What about:

LL_no_heads = LL[1:]

It should save some typing... :)

Julius Welby (author) 21 years, 8 months ago

Yes. Quite right. The version of this script in the Cookbook book has this and several other improvements. I will synch the two versions soon. Out of interest, I will be checking the effect on performance of the much more compact code in the published version.

Unfortunately the O'Reilly published version has problems. Two of the entity replacements are broken. It looks like the editor didn't use the source code, but copied and pasted from a browser.

I'm trying to get this fixed in their online version of the module. Too late for the printed version, I expect (I've not seen it yet).

Ah well, good to be in there, anyway.
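
Hand-written entity replacements are easy to damage in transit, because a browser renders an entity such as &amp;amp; as a plain &amp;. One way to avoid retyping them is to lean on the standard library; the sketch below (Python 3, with a deliberately wrong helper invented for contrast) also shows why the recipe insists that the ampersand replacement comes first.

```python
from xml.sax.saxutils import escape

def escape_ampersand_last(s):
    """Deliberately wrong order, for illustration only."""
    s = s.replace("<", "&lt;")
    s = s.replace("&", "&amp;")  # also hits the '&' of the '&lt;' just produced
    return s

text = "a < b & c"
wrong = escape_ampersand_last(text)
right = escape(text)  # handles &, < and > in a safe order

# Extra entities (quotes) can be supplied as a mapping.
quoted = escape('say "hi"', {'"': "&quot;", "'": "&apos;"})
print(wrong)
print(right)
print(quoted)
```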

Created by Julius Welby on Sun, 8 Jul 2001 (PSF)

Required Modules

  • (none specified)