
Another recipe for converting an XML file into a Python dictionary. This one uses lxml.

Python
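from lxml import etree


def dictlist(node):
    """Convert an lxml element and its whole subtree into a nested dictionary."""
    children = []
    xmltodict(node, children)
    # Note: the root entry uses the key 'attribs'; child entries use 'attributes'.
    return {node.tag: {'value': children, 'attribs': dict(node.attrib), 'tail': node.tail}}


def xmltodict(node, res):
    """Append the dictionary form of node's children (or of node itself, if it is a leaf) to res."""
    if len(node):
        for child in node:
            if len(child):
                # The child has children of its own: collect them recursively.
                # Its own text is not recorded, only its children.
                grandchildren = []
                xmltodict(child, grandchildren)
                value = {'value': grandchildren, 'attributes': dict(child.attrib), 'tail': child.tail}
                res.append({child.tag: value})
            else:
                # Leaf child: the recursive call appends exactly one entry to res.
                xmltodict(child, res)
    else:
        # Leaf node: store its text as the value.
        value = {'value': node.text, 'attributes': dict(node.attrib), 'tail': node.tail}
        res.append({node.tag: value})


def main():
    tree = etree.parse('test2.xml')
    res = dictlist(tree.getroot())
    print(res)


if __name__ == '__main__':
    main()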

If you pass in an XML file like this one:

<?xml version="1.0"?>
<elements idc="002">
<element idl="0001">
<singleelem id="000"/>
bbbb
</element>
<dragon>
<moredragon type="fire" color="red">
	Enter the dragon
</moredragon>
</dragon>
</elements>

the result is a dictionary like this:

{'elements': {'attribs': {'idc': '002'},
              'tail': None,
              'value': [{'element': {'attributes': {'idl': '0001'},
                                     'tail': '\n',
                                     'value': [{'singleelem': {'attributes': {'id': '000'},
                                                               'tail': '\nbbbb\n',
                                                               'value': None}}]}},
                        {'dragon': {'attributes': {},
                                    'tail': '\n',
                                    'value': [{'moredragon': {'attributes': {'color': 'red', 'type': 'fire'},
                                                              'tail': '\n',
                                                              'value': '\n\tEnter the dragon\n'}}]}}]}}
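To make the nesting concrete, here is a small sketch of how one might pull individual values out of that result by plain indexing. The variable names are illustrative only, and the `result` literal is simply a copy of the output shown above rather than something produced by running the recipe here.

```python
# A copy of the dictionary shown above (not produced by running the recipe here).
result = {'elements': {'attribs': {'idc': '002'}, 'tail': None, 'value': [
    {'element': {'attributes': {'idl': '0001'}, 'tail': '\n', 'value': [
        {'singleelem': {'attributes': {'id': '000'}, 'tail': '\nbbbb\n', 'value': None}}]}},
    {'dragon': {'attributes': {}, 'tail': '\n', 'value': [
        {'moredragon': {'attributes': {'color': 'red', 'type': 'fire'},
                        'tail': '\n', 'value': '\n\tEnter the dragon\n'}}]}}]}}

root = result['elements']
root_id = root['attribs']['idc']            # the root uses 'attribs'

# Children are a list of single-key dictionaries, in document order.
dragon = root['value'][1]['dragon']
moredragon = dragon['value'][0]['moredragon']

text = moredragon['value'].strip()          # leaf nodes carry their text in 'value'
color = moredragon['attributes']['color']   # non-root nodes use 'attributes'

# Text that follows an empty element ends up in that element's 'tail'.
trailing = root['value'][0]['element']['value'][0]['singleelem']['tail'].strip()
```

Because children are stored as a list rather than keyed by tag, repeated sibling tags are preserved, but lookups have to go by position or by scanning the list.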

Each node is represented as a key/value pair of this form:

{node.tag: {'value': text or list of child nodes, 'tail': tail of the node, 'attributes': dict of node attributes}}

The node tag is the key. Its value is a dictionary whose 'value' entry holds a list of child nodes, or the node text when the node is a leaf, alongside the node's attributes and tail. The root entry is the one exception: it stores its attributes under 'attribs' rather than 'attributes'. The text of a node that has children is not recorded.
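Since every node follows the same shape, the structure can be traversed with one short recursive function. The sketch below is not part of the recipe: `walk` and `sample` are hypothetical names, and it assumes the layout described above, including the root's 'attribs' key.

```python
def walk(entry, depth=0):
    """Yield (depth, tag, attributes) for every node in a converted dictionary."""
    for tag, info in entry.items():
        # The root entry stores attributes under 'attribs', all others under 'attributes'.
        attrs = info.get('attributes', info.get('attribs', {}))
        yield depth, tag, dict(attrs)
        value = info['value']
        # A list means child nodes; anything else is leaf text or None.
        if isinstance(value, list):
            for child in value:
                yield from walk(child, depth + 1)


# Hypothetical sample in the recipe's output format.
sample = {'elements': {'attribs': {'idc': '002'}, 'tail': None, 'value': [
    {'dragon': {'attributes': {}, 'tail': '\n', 'value': [
        {'moredragon': {'attributes': {'color': 'red', 'type': 'fire'},
                        'tail': '\n', 'value': '\n\tEnter the dragon\n'}}]}}]}}

nodes = list(walk(sample))
for depth, tag, attrs in nodes:
    print('  ' * depth + tag, attrs)
```

The `isinstance(value, list)` check is what distinguishes an inner node from a leaf in this format, since a leaf's 'value' is either a string or None.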

2 comments

Paddy McCarthy, 13 years, 7 months ago

XML in comment gobbled. Hi, you need to escape your XML, as it did not show up.

Paddy.

Vivek Khurana (author), 13 years, 7 months ago

How do I escape the XML? I tried putting the XML in a pre or code block, but no luck.

Created by Vivek Khurana on Tue, 15 Apr 2008 (PSF)