Hey, I have a .txt file. I want to read it and write the output in .xml format. Any suggestions?
Paper 1 / White Spaces are included
Single Correct Answer Type
1. Text of question 1
a) Option 1.a b) Option 1.b
c) Option 1.c d) Option 1.d
2. Text of question 2
a) This is an example of Option 2.a
b) Option 2.b has a special char α
c) Option 2.c
d) Option 2.d
3. Text of question 3
a) Option 3.a can span multiple
lines.
b) Option 3b
c) Option 3c
d) Option 3d
My code:
from lxml import etree
import csv
root = etree.Element('data')
#f = open('input1.txt','rb')
rdr = csv.reader(open("input1.txt",newline='\n'))
header = next(rdr)
for row in rdr:
    eg = etree.SubElement(root, 'eg')
    for h, v in zip(header, row):
        etree.SubElement(eg, h).text = v
f = open(r"C:\temp\input1.xml", "w")
f.write(etree.tostring(root))
f.close()
I'm getting an error like:
Traceback (most recent call last):
  File "E:\python3.2\input1.py", line 11, in <module>
    etree.SubElement(eg, h).text = v
  File "lxml.etree.pyx", line 2995, in lxml.etree.SubElement (src\lxml\lxml.etree.c:69677)
  File "apihelpers.pxi", line 188, in lxml.etree._makeSubElement (src\lxml\lxml.etree.c:15691)
  File "apihelpers.pxi", line 1571, in lxml.etree._tagValidOrRaise (src\lxml\lxml.etree.c:29249)
ValueError: Invalid tag name ' Paper 1'
I also want the white space to be preserved. I'm using Python 3.2. Any suggestions?
Tags: python3_1
Well, if you just want to copy the data from your .txt file into a .xml file:
with open("filename.txt") as f:
    rd = f.readlines()
with open("newfile.xml", "w") as v:
    for i in rd:
        print(i)
        v.write(i)
This prints each line of the .txt file, creates newfile.xml, and writes whatever is in filename.txt into it unchanged. Note that it does not add any XML markup.
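To get real XML elements out of a question paper like the one in the question, the text has to be parsed rather than fed to csv.reader, because the file is not comma-separated and its first line (' Paper 1') is not a legal tag name. Below is a minimal sketch, not a definitive solution. It assumes the layout shown in the question: a title line, a section line, questions starting with "N." and options marked "a)" to "d)". The function name text_to_xml and the element names paper, title, section, question, text and option are made up for illustration. It uses the standard-library xml.etree.ElementTree instead of lxml; the calls used here look the same in both.

```python
import re
import xml.etree.ElementTree as ET

# A question starts with a number and a dot, e.g. "3. Text of question 3".
QUESTION_RE = re.compile(r"^\s*(\d+)\.\s+(.*)$")
# An option marker is a letter a-d followed by ")", at the start or after white space.
OPTION_RE = re.compile(r"(?:^|\s)([a-d])\)\s+")


def text_to_xml(lines):
    """Build an XML tree from the lines of a question paper (hypothetical helper)."""
    raw = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    root = ET.Element("paper")
    # The title goes into element text, where leading white space is allowed.
    ET.SubElement(root, "title").text = raw[0]
    section = ET.SubElement(root, "section", type=raw[1].strip())

    # Group the remaining lines into one block per question.
    blocks = []
    for ln in raw[2:]:
        match = QUESTION_RE.match(ln)
        if match:
            blocks.append((match.group(1), [match.group(2).strip()]))
        elif blocks:
            blocks[-1][1].append(ln.strip())

    for number, body in blocks:
        question = ET.SubElement(section, "question", number=number)
        # Joining with a space merges options that span several lines.
        parts = OPTION_RE.split(" ".join(body))
        ET.SubElement(question, "text").text = parts[0].strip()
        for label, text in zip(parts[1::2], parts[2::2]):
            ET.SubElement(question, "option", label=label).text = text.strip()
    return root


if __name__ == "__main__":
    sample = """ Paper 1
Single Correct Answer Type
1. Text of question 1
a) Option 1.a b) Option 1.b
c) Option 1.c d) Option 1.d
"""
    tree = text_to_xml(sample.splitlines())
    # encoding="unicode" returns str, which is safe to print or write in text mode.
    print(ET.tostring(tree, encoding="unicode"))
```

To write the result to disk, ET.ElementTree(root).write("input1.xml", encoding="utf-8", xml_declaration=True) handles the encoding itself. This avoids writing the bytes returned by tostring() to a file opened in text mode, which the code in the question does. The option regex is deliberately simple and would misfire on option text that itself contains something like " a) ".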
Those funny characters you are getting here are the UTF-8 BOM (byte order mark). Try opening the file with the utf-8-sig encoding, which strips the BOM when reading.
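A minimal sketch of the BOM problem and one way around it, assuming the input file was saved as UTF-8 with a BOM (the file name and the helper read_text_without_bom are hypothetical). Python's built-in "utf-8-sig" codec drops a leading BOM when decoding, while plain "utf-8" keeps it as the character U+FEFF at the start of the first line.

```python
import codecs
import os
import tempfile


def read_text_without_bom(path):
    """Read a UTF-8 file, dropping a leading BOM if one is present (hypothetical helper)."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


if __name__ == "__main__":
    # Create a small file that starts with the UTF-8 BOM, as some editors do.
    path = os.path.join(tempfile.mkdtemp(), "input1.txt")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + "Paper 1\n".encode("utf-8"))

    with open(path, encoding="utf-8") as f:
        print(repr(f.read()))  # the BOM survives as a leading U+FEFF character

    print(repr(read_text_without_bom(path)))  # the BOM is gone
```

Files without a BOM are read unchanged by "utf-8-sig", so it is safe to use when you are not sure how the file was saved.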
See the Codecs library [1] and the Unicode HOWTO [2] for Python.
[1] https://docs.python.org/2/library/codecs.html
[2] https://docs.python.org/2/howto/unicode.html