Welcome, guest | Sign In | My Account | Store | Cart

This script parses an MS-Excel spreadsheet saved as XML.

The spreadsheet: a1, b1 a2, b2

will be stored like this:

[[u'a1', u'b1'], [u'a2', u'b2']]

Python, 33 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
import sys
from xml.sax import saxutils
from xml.sax import parse

class ExcelHandler(saxutils.DefaultHandler):
    def __init__(self):
        self.chars=[]
        self.cells=[]
        self.rows=[]
        self.tables=[]
        
    def characters(self, content):
        self.chars.append(content)

    def startElement(self, name, atts):
        if name=="Cell":
            self.chars=[]
        elif name=="Row":
            self.cells=[]
        elif name=="Table":
            self.rows=[]
    
    def endElement(self, name):
        if name=="Cell":
            self.cells.append(''.join(self.chars))
        elif name=="Row":
            self.rows.append(self.cells)
        elif name=="Table":
            self.tables.append(self.rows)
    
excelHandler=ExcelHandler()
parse(sys.argv[1], excelHandler)
print excelHandler.tables

Empty cells get ignored.

3 comments

sasa sasa 16 years, 10 months ago  # | flag

I would like to do the contrary. I was wondering if there is a lybrary that makes it easier to generate XML excel sheets with python.

Stephen Chappell 15 years, 8 months ago  # | flag

Error.

Traceback (most recent call last):
  File "H:\XML_Parser.py", line 5, in ?
    class ExcelHandler(saxutils.DefaultHandler):
AttributeError: 'module' object has no attribute 'DefaultHandler'
Allan Anderson 15 years, 7 months ago  # | flag

Needs to be updated for current Python. Add "handler" to the "from xml.sax import" line.

Change "saxutils.DefaultHandler" to "handler.ContentHandler".

Then it works with Activestate's Python 2.4.