How to parse lines containing comma-separated values.
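# Split one CSV record into a list of its values.
proc csv2list {str {sepChar ,}} {
    # Mark a quote at the very start or end of the record with NUL (\0).
    regsub -all {(\A\"|\"\Z)} $str \0 str
    # Turn doubled quotes into literal quotes and every remaining quote
    # into a NUL marker. The first two pairs handle values that begin or
    # end with an escaped quote.
    set str [string map [list $sepChar\"\"\" $sepChar\0\" \
                              \"\"\"$sepChar \"\0$sepChar \
                              \"\" \" \
                              \" \0 ] $str]
    # Protect separators inside each quoted value (a NUL ... NUL span)
    # by replacing them with \1.
    set end 0
    while {[regexp -indices -start $end {(\0)[^\0]*(\0)} $str -> start end]} {
        set start [lindex $start 0]
        set end [lindex $end 0]
        set range [string range $str $start $end]
        set first [string first $sepChar $range]
        if {$first >= 0} {
            set str [string replace $str $start $end \
                [string map [list $sepChar \1] $range]]
        }
        incr end
    }
    # Unprotected separators become NUL field boundaries, protected
    # separators are restored, and the quote markers are dropped.
    set str [string map [list $sepChar \0 \1 $sepChar \0 {} ] $str]
    return [split $str \0]
}

# Convert a list of rows (each row a list of values) into CSV text,
# one record per line.
proc list2csv {list {sepChar ,}} {
    set out ""
    foreach l $list {
        set sep {}
        foreach val $l {
            # Quote the value if it contains a quote or the separator,
            # doubling any embedded quotes.
            if {[string match "*\[\"$sepChar\]*" $val]} {
                append out $sep\"[string map [list \" \"\"] $val]\"
            } else {
                append out $sep$val
            }
            set sep $sepChar
        }
        append out \n
    }
    return $out
}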
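A record in a CSV (comma-separated values) file, as exported by Excel for example, is a sequence of ASCII values separated by ",". In some locales the separator is ";" instead; that makes no difference here, because both procedures accept the separator as the optional sepChar argument.
If a value itself contains the separator, the value is enclosed in double quotes ("...").
If a value contains a double quote ("), each such quote is written as two double quotes (""), and the value is likewise enclosed in double quotes.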
The following record, for example, is parsed into three values:
<pre>123,"123,521.2","Mary says ""Hello, I am Mary"""</pre>
- 123
- 123,521.2
- Mary says "Hello, I am Mary"
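The quoting rules can also be read as a small state machine that scans the record one character at a time. The sketch below is only an illustration of those rules, not the string-map technique used by csv2list; the name parseCsvRecord is hypothetical and not part of the original page.

```tcl
# Hypothetical helper (an assumption for illustration): split one CSV
# record into its values by scanning it character by character.
proc parseCsvRecord {line {sepChar ,}} {
    set fields {}
    set field ""
    set inQuotes 0
    set len [string length $line]
    for {set i 0} {$i < $len} {incr i} {
        set c [string index $line $i]
        if {$inQuotes} {
            if {$c eq "\""} {
                if {[string index $line [expr {$i + 1}]] eq "\""} {
                    # "" inside a quoted value stands for one literal quote.
                    append field "\""
                    incr i
                } else {
                    # A single quote closes the quoted value.
                    set inQuotes 0
                }
            } else {
                # Separators inside quotes are ordinary characters.
                append field $c
            }
        } elseif {$c eq "\""} {
            # An opening quote starts a quoted value.
            set inQuotes 1
        } elseif {$c eq $sepChar} {
            # An unquoted separator ends the current value.
            lappend fields $field
            set field ""
        } else {
            append field $c
        }
    }
    # The last value has no trailing separator.
    lappend fields $field
    return $fields
}

set record {123,"123,521.2","Mary says ""Hello, I am Mary"""}
foreach value [parseCsvRecord $record] {
    puts $value
}
```

Run on the example record, this prints the three values listed above, one per line. The sketch does not validate its input, so malformed records (for instance an unterminated quote) are accepted silently.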
CSV routines are part of TclLib as of version 1.0: TclLib (http://tcllib.sf.net) contains routines for converting to and from CSV, including reading from and writing to channels.
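For comparison, here is a minimal sketch of the same conversion using the csv package from TclLib. It assumes TclLib is installed and can be found through the interpreter's package path; csv::split turns one record into a list of values, and csv::join does the reverse.

```tcl
# Assumes TclLib's csv package is installed and on the package path.
package require csv

set record {123,"123,521.2","Mary says ""Hello, I am Mary"""}

# Split one CSV record into a Tcl list of values.
set fields [csv::split $record]
foreach value $fields {
    puts $value
}

# Join the values back into a single CSV record.
puts [csv::join $fields]
```

Both commands accept an optional separator character after their first argument, for example `csv::split $record ";"` for semicolon-separated data.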