I use this for configuration. I hadn't intended to put it up anywhere, but there have been a couple of discussions lately about converting XML to Python dicts, so I feel obligated to share another approach, one based on Fredrik Lundh's ElementTree.
try:
    import cElementTree as ElementTree  # standalone package on older Pythons
except ImportError:
    from xml.etree import ElementTree  # standard library


class XmlListConfig(list):
    def __init__(self, aList):
        for element in aList:
            # len(element) is the number of child elements
            if len(element):
                # treat like dict
                if len(element) == 1 or element[0].tag != element[1].tag:
                    self.append(XmlDictConfig(element))
                # treat like list
                elif element[0].tag == element[1].tag:
                    self.append(XmlListConfig(element))
            elif element.text:
                text = element.text.strip()
                if text:
                    self.append(text)


class XmlDictConfig(dict):
    '''
    Example usage:

    >>> tree = ElementTree.parse('your_file.xml')
    >>> root = tree.getroot()
    >>> xmldict = XmlDictConfig(root)

    Or, if you want to use an XML string:

    >>> root = ElementTree.XML(xml_string)
    >>> xmldict = XmlDictConfig(root)

    And then use xmldict for what it is... a dict.
    '''
    def __init__(self, parent_element):
        if parent_element.items():
            self.update(dict(parent_element.items()))
        for element in parent_element:
            # len(element) is the number of child elements
            if len(element):
                # treat like dict - we assume that if the first two tags
                # in a series are different, then they are all different.
                if len(element) == 1 or element[0].tag != element[1].tag:
                    aDict = XmlDictConfig(element)
                # treat like list - we assume that if the first two tags
                # in a series are the same, then the rest are the same.
                else:
                    # here, we put the list in dictionary; the key is the
                    # tag name the list elements all share in common, and
                    # the value is the list itself
                    aDict = {element[0].tag: XmlListConfig(element)}
                # if the tag has attributes, add those to the dict
                if element.items():
                    aDict.update(dict(element.items()))
                self.update({element.tag: aDict})
            # this assumes that if you've got an attribute in a tag,
            # you won't be having any text. This may or may not be a
            # good idea -- time will tell. It works for the way we are
            # currently doing XML configuration files...
            elif element.items():
                self.update({element.tag: dict(element.items())})
            # finally, if there are no child tags and no attributes, extract
            # the text
            else:
                self.update({element.tag: element.text})
This uses two simple classes to provide the machinery for XML conversion. See the comments and usage in the code for a detailed explanation.
I make constant use of ElementTree, but my efforts are inexpert at best. If anyone can share how to make its use more elegant, I'd love to see it.
Update: Fredrik Lundh was kind enough to take a look at this recipe and offer suggestions for cleaning it up. These have been implemented, and my thanks go out to him.
I am using this recipe and found an issue with it: an XML file that contains multiple children with the same name ends up with a single key in the dictionary, whose value is taken from the LAST child with that name that was processed.
This XML file will result in a dictionary with a single key called "Purchased" and the values associated with it will be {'PurchaseId': 'ccc1', 'PurchaseDate': 'ccc2', 'PurchaseOrigin': 'ccc3'}
Example:
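The XML listing for this example did not survive on this page. As a rough illustration of the problem described above, here is a minimal sketch that mimics the recipe's per-child `dict.update()` call. Only the `ccc` values come from the comment; the root tag `Purchases`, the `aaa`/`bbb` values, and the helper name `naive_children_to_dict` are assumptions made up for the demonstration.

```python
from xml.etree import ElementTree

# Hypothetical input: only the 'ccc' values are taken from the comment above;
# the root tag and the 'aaa'/'bbb' values are invented for illustration.
XML_STRING = """
<Purchases>
  <Purchased>
    <PurchaseId>aaa1</PurchaseId>
    <PurchaseDate>aaa2</PurchaseDate>
    <PurchaseOrigin>aaa3</PurchaseOrigin>
  </Purchased>
  <Purchased>
    <PurchaseId>bbb1</PurchaseId>
    <PurchaseDate>bbb2</PurchaseDate>
    <PurchaseOrigin>bbb3</PurchaseOrigin>
  </Purchased>
  <Purchased>
    <PurchaseId>ccc1</PurchaseId>
    <PurchaseDate>ccc2</PurchaseDate>
    <PurchaseOrigin>ccc3</PurchaseOrigin>
  </Purchased>
</Purchases>
""".strip()


def naive_children_to_dict(parent):
    """Mimic the recipe's update-per-child step: one key per tag name."""
    result = {}
    for child in parent:
        # Same tag -> same key, so each sibling overwrites the previous one.
        result.update({child.tag: {leaf.tag: leaf.text for leaf in child}})
    return result


root = ElementTree.XML(XML_STRING)
converted = naive_children_to_dict(root)
print(converted)
```

Because all three children share the tag `Purchased`, only the last one is left in `converted`.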
Here is the change that I made in the XmlDictConfig class; the additional code is separated from the original with empty lines:
What it does is the following:
In the end, all the nodes will be accessible as elements of a list that corresponds to a key in the dictionary.
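The modified code itself is not reproduced here, so the following is only a sketch of the general idea, not the commenter's actual change: instead of overwriting a key when a tag repeats, every child is appended to a list stored under its tag name. The helper name `children_to_lists` and the sample XML are hypothetical, and for brevity the sketch always builds a list, even for a tag that occurs once.

```python
from xml.etree import ElementTree


def children_to_lists(parent):
    """Group child elements by tag; each key maps to a list of converted children."""
    result = {}
    for child in parent:
        value = {leaf.tag: leaf.text for leaf in child}
        # setdefault creates the list on first sight of a tag, then we append.
        result.setdefault(child.tag, []).append(value)
    return result


# Hypothetical sample input for the demonstration.
root = ElementTree.XML(
    "<Purchases>"
    "<Purchased><PurchaseId>aaa1</PurchaseId></Purchased>"
    "<Purchased><PurchaseId>bbb1</PurchaseId></Purchased>"
    "</Purchases>"
)
grouped = children_to_lists(root)
print(grouped)
```

With this shape, no sibling is lost: `grouped["Purchased"]` holds one entry per `Purchased` node, in document order.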
P.S. My code is not very Pythonic.
This is a prettier approach: the list of children is built using a list comprehension.
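The comprehension-based version is also missing from this page. One way such an approach might look, purely as a sketch with a hypothetical helper name `group_children` and a made-up sample document, is to collect the distinct child tags and then build each list with `Element.findall`:

```python
from xml.etree import ElementTree


def group_children(parent):
    """Map each distinct child tag to the list of texts of the children with that tag."""
    # dict.fromkeys keeps the tags unique while preserving document order.
    tags = dict.fromkeys(child.tag for child in parent)
    # findall(tag) returns only the direct children of parent with that tag.
    return {tag: [child.text for child in parent.findall(tag)] for tag in tags}


# Hypothetical sample input for the demonstration.
root = ElementTree.XML("<r><a>1</a><b>2</b><a>3</a></r>")
by_tag = group_children(root)
print(by_tag)
```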
OK, this code seems to be referenced quite a bit. I think it is worth saying that you might use
from xml.etree import ElementTree
I found that this wasn't working correctly when an element has a child with both attributes AND text, a limitation you mention in the comments. To fix this I changed the following:
Hope this helps someone else who needs it. The text for the element will be in the dict with the attributes, under the key __Content__.
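The changed lines are not shown on this page. As a sketch of the behaviour described, assuming a standalone helper (the name `attributes_and_text` is hypothetical, and this is not the commenter's actual patch), an element's attributes and its text can be combined like this:

```python
from xml.etree import ElementTree


def attributes_and_text(element):
    """Return the element's attributes, plus its text under '__Content__' if any."""
    data = dict(element.items())
    text = (element.text or "").strip()
    if text:
        # The key name follows the comment above; everything else is assumed.
        data["__Content__"] = text
    return data


# Hypothetical sample inputs for the demonstration.
with_text = attributes_and_text(ElementTree.XML('<server port="8080">example.org</server>'))
without_text = attributes_and_text(ElementTree.XML('<server port="8080"/>'))
print(with_text)
print(without_text)
```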
-Laserath
I've been using yours, but it doesn't work well with lists. What about this solution?
Recursive, easier and less code:
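The posted solution is not preserved here. The sketch below shows what a short recursive conversion could look like; it is an illustration under stated assumptions, not the commenter's code. The function name `element_to_data` is hypothetical, attributes are ignored for brevity, and repeated sibling tags are promoted to a list.

```python
from xml.etree import ElementTree


def element_to_data(element):
    """Recursively convert an element to a str (leaf) or a dict (has children)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    data = {}
    for child in children:
        value = element_to_data(child)
        if child.tag not in data:
            data[child.tag] = value
        elif isinstance(data[child.tag], list):
            # Third or later sibling with this tag: extend the existing list.
            data[child.tag].append(value)
        else:
            # Second sibling with this tag: promote the single value to a list.
            data[child.tag] = [data[child.tag], value]
    return data


# Hypothetical sample input for the demonstration.
root = ElementTree.XML(
    "<config><name>demo</name>"
    "<hosts><host>a</host><host>b</host></hosts></config>"
)
config = element_to_data(root)
print(config)
```

Since the function itself only ever returns a string or a dict, a list stored in `data` can only have come from the promotion step, which keeps the `isinstance` check unambiguous.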
Luis Martin Gil luismartingil.com
I've appended the following to XmlListConfig:
This is for when we have a list of items that have only attributes and no inner text.
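The appended lines are not shown on this page. A minimal sketch of the case described, written as a standalone function rather than as a patch to XmlListConfig (the name `list_items` and the sample XML are hypothetical), might be:

```python
from xml.etree import ElementTree


def list_items(parent):
    """Build a list from leaf children: text if present, else their attributes."""
    items = []
    for element in parent:
        text = (element.text or "").strip()
        if text:
            items.append(text)
        elif element.items():
            # No inner text: fall back to the element's attributes as a dict.
            items.append(dict(element.items()))
    return items


# Hypothetical sample input for the demonstration.
root = ElementTree.XML(
    '<hosts><host name="a" port="1"/><host name="b" port="2"/></hosts>'
)
hosts = list_items(root)
print(hosts)
```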