This program takes a fixed-width database output file, consisting of a header line of column names, a line of dashes, and data rows, and converts it to CSV. The code assumes that the runs of dashes give the fixed column widths. For simplicity, all double quotes are removed from the data and every column is wrapped in quotes.
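
As a minimal sketch of the width assumption described above: each run of dashes is taken as one column width, and a single space is assumed between columns. The sample values and the 12/7/16 widths mirror the format comment in the recipe; `pad_row` and `split_fixed` are illustrative helper names, not part of the recipe.

```python
# Column widths used by the sample layout in the recipe's header comment.
SAMPLE_WIDTHS = (12, 7, 16)

def pad_row(values, widths=SAMPLE_WIDTHS):
    """Build one fixed-width line: values left-padded, one space between columns."""
    return " ".join(v.ljust(w) for v, w in zip(values, widths))

header = pad_row(["header1", "header2", "header3"])
dashes = pad_row(["-" * w for w in SAMPLE_WIDTHS])
data = pad_row(["data_a1", "data_a2", "data_a3"])

# Each run of dashes gives the width of one column.
widths = [len(run) for run in dashes.split()]
print(widths)  # [12, 7, 16]

def split_fixed(line, widths):
    """Slice a line by the given widths, skipping the one-space separator."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width + 1
    return fields

print(split_fixed(header, widths))  # ['header1', 'header2', 'header3']
print(split_fixed(data, widths))    # ['data_a1', 'data_a2', 'data_a3']
```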

Python, 46 lines
# Ian Maurer 
# http://itmaurer.com/
# Convert a Fixed Width file to a CSV with Headers
#
# Requires following format:
#
# header1      header2 header3
# ------------ ------- ----------------
# data_a1      data_a2 data_a3

def writerow(ofile, row):
    for i in range(len(row)):
        row[i] = '"' + row[i].replace('"', '') + '"'
    data = ",".join(row)
    ofile.write(data)
    ofile.write("\n")

def convert(ifile, ofile):
    header = ifile.readline().strip()
    while not header:
        header = ifile.readline().strip()

    hticks = ifile.readline().strip()
    csizes = [len(cticks) for cticks in hticks.split()]
    
    line = header
    while line:

        start, row = 0, []
        for csize in csizes:
            column = line[start:start+csize].strip()
            row.append(column)
            start = start + csize + 1

        writerow(ofile, row)
        line = ifile.readline().strip()

if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        ifile = open(sys.argv[1], "r")
        ofile = open(sys.argv[2], "w+")
        convert(ifile, ofile)
        
    else:
        print "Usage: python convert.py <input> <output>"
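
For readers on Python 3, here is a hedged sketch of the same conversion. It is not the author's code: it swaps the hand-written `writerow` for the standard `csv` module with `QUOTE_ALL`, takes column offsets from the positions of the dash runs with `re.finditer`, and strips only line endings so that leading spaces do not move the columns. Like the recipe, it removes double quotes from the data and stops at the first blank line after the header. The sample data is made up.

```python
import csv
import io
import re

def convert(ifile, ofile):
    """Convert fixed-width text (header, dash line, data rows) to CSV."""
    # Strip only line endings; leading spaces are part of the column layout.
    lines = (line.rstrip("\r\n") for line in ifile)

    # Skip leading blank lines to find the header.
    header = next((line for line in lines if line.strip()), None)
    if header is None:
        return
    dashes = next(lines, "")

    # Each run of dashes marks the start and end offset of one column.
    spans = [m.span() for m in re.finditer(r"-+", dashes)]

    def cut(line):
        # Remove double quotes from the data, as the recipe does.
        return [line[a:b].strip().replace('"', "") for a, b in spans]

    writer = csv.writer(ofile, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(cut(header))
    for line in lines:
        if not line.strip():
            break  # like the recipe, stop at the first blank line
        writer.writerow(cut(line))

# Usage with in-memory files (made-up sample data).
sample = io.StringIO(
    "fruit qty\n"
    "----- ---\n"
    "apple 3\n"
    "mango 12\n"
)
out = io.StringIO()
convert(sample, out)
print(out.getvalue())
```

With real files, the two handles can be opened in a `with` statement so that both are closed even if the conversion raises.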

3 comments

Joe Steffl 15 years, 11 months ago

To handle Unicode I/O:

To convert a UTF-16 encoded file to a 'unicode-escape' encoded file, replace the following two lines:

ifile = open(sys.argv[1], "r")
ofile.write(data)

with these (the codecs module must also be imported):

ifile = codecs.open(sys.argv[1], encoding='utf-16', errors='strict')
ofile.write(data.encode('unicode-escape'))

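
A rough Python 3 take on the same idea, offered as a sketch rather than a drop-in patch: the built-in `open` accepts an `encoding` argument, so `codecs.open` is not needed for reading. The output is opened in binary mode here because `str.encode('unicode-escape')` returns bytes. The function name `escape_utf16_file`, the file names, and the sample text are all made up for illustration.

```python
import os
import tempfile

def escape_utf16_file(src, dst):
    """Read UTF-16 text from src and write unicode-escape encoded lines to dst."""
    with open(src, "r", encoding="utf-16", errors="strict") as ifile, \
         open(dst, "wb") as ofile:
        for line in ifile:
            data = line.rstrip("\n")
            # Escape the data only; the newline is written separately,
            # as in the recipe's writerow().
            ofile.write(data.encode("unicode-escape") + b"\n")

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    # Create a small UTF-16 input file containing a non-ASCII character.
    with open(src, "w", encoding="utf-16") as f:
        f.write("caf\u00e9 12\n")
    escape_utf16_file(src, dst)
    with open(dst, "rb") as f:
        result = f.read()

print(result)  # b'caf\\xe9 12\n'
```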
Scott Nichols 5 years, 6 months ago

The fixed length script worked for me, but I had to make a few minor changes:

1) Removed all the .strip()'s at lines 19, 21, 31 and 36. This had to be done because the stripping was changing the starting position of the columns. Not sure which strip was the culprit, but removing all of them fixed this.

2) Added a .strip() to the column data on line 32 to remove white space after each column is read: row.append(column.strip())

3) Changed the last line to print("Usage: python convert.py <input> <output>")

Thought I would share in case anyone else ran into this issue. --Scott

Scott Nichols 5 years, 6 months ago

My last comment was hard to read, so I have rewritten it:

The fixed-length-to-CSV script worked for me, but I had to make a few minor changes:

1) Removed all the .strip() calls at lines 19, 21, 31 and 36. This had to be done because one of these strips was changing the starting position of the columns. I'm not sure which strip was the culprit, but removing all of them fixed this.

2) Added a .strip() to the column data on line 32 to remove white space after each column is read: row.append(column.strip())

3) Changed the last line to print("Usage: python convert.py <input> <output>")

Thought I would share this in case anyone else ran into this issue. Other than that, the script works great. --Scott
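
One plausible cause of the shifted columns, sketched under the assumption that some rows begin with spaces (for example, an empty first column): calling `.strip()` on the whole line removes the leading spaces, so every later slice starts too early. Stripping only the line ending keeps the offsets intact. The widths and the sample row below are made up, and `split_fixed` is an illustrative helper, not a function from the recipe.

```python
widths = [5, 3]

def split_fixed(line, widths):
    """Slice a line by the given widths, skipping the one-space separator."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width + 1
    return fields

# A row whose first column is empty, so the line starts with spaces.
line = " " * 5 + " " + "12" + "\n"

# Stripping the whole line drops the leading spaces and shifts the data left.
shifted = split_fixed(line.strip(), widths)
print(shifted)  # ['12', '']

# Stripping only the line ending preserves the column positions.
aligned = split_fixed(line.rstrip("\r\n"), widths)
print(aligned)  # ['', '12']
```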

Created by Ian Maurer on Tue, 15 Nov 2005 (PSF)