Welcome, guest | Sign In | My Account | Store | Cart

This routine allows you to split a string on multiple characters. Additionally, you can specify the maximum number of elements to return. The final element will contain the remainder of the string.

Tcl, 131 lines
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
proc ParseString { variable_name string separators maximum } {
  # this routine makes parsing easy WHILE preserving
  # the "exactness" of the string by NOT treating it as a list...
  #
  # get ahold of an array to put results into
  upvar "1" $variable_name local_array

  # get a list of separators...
  set separator_list [split $separators ""]

  # get length in characters
  set count [string length $string]

  # start at first index (maybe make this variable later?)
  set index "0"

  # always start counting in result array from 1 
  # (should this really be zero?)
  set found_index "1"

  # how many "matches" did we find?
  #
  # NOTE: this will NOT be more than the parameter 
  #       maximum, if specified
  #
  set found_count "0"

  # current string that needs to be added when next 
  # separator is found...
  set found_string ""

  #
  # keep going until the end of the string is reached
  #
  while {$index < $count} {
    #
    # go through string on a character-by-character basis
    #
    set character [string index $string $index]
    #
    # if the character is in the separator list,
    # then we need to add to the array...
    #
    if {[lsearch -exact $separator_list $character] != "-1"} then {
      if {$maximum > "0"} then {
        #
        # we are limiting the number of "matches" to a certain amount
        # to allow for rather flexible argument parsing for callers...
        # (they can treat the first X arguments as separate, and the 
        # rest as one long argument)
        #
        if {$found_count == [expr {$maximum - "1"}]} then {
          # stop adding new after X matches... (last one is taken care 
          # of after loop)
          set do_add "0"
        } else {
          # we haven't reached the maximum yet
          set do_add "1"
        }
      } else {
        # there is no maximum
        set do_add "1"
      }
    } else {
      # we didn't find a separator yet
      set do_add "0"
    }

    if {$do_add != "0"} then {
      #
      # add string to found array...
      #
      set local_array($found_index) $found_string
      # next index in result array
      incr found_index
      # increase count of found arguments
      incr found_count
      # reset current string
      set found_string ""
    } else {
      #
      # otherwise, just keep appending to current string
      #
      if {$found_string != ""} then {
        # tack on the current character (this is not a separator)
        append found_string $character
      } else {
        # since no other characters in the current string yet, 
        # just set it
        set found_string $character
      }
    }

    incr index
  }

  #
  # don't forget last one... in case there is one...
  # (this should always happen if the string doesn't end in space...)
  #
  if {$found_string != ""} then {
    # add FINAL string to found array...
    set local_array($found_index) $found_string
    # next index in result array
    incr found_index
    # increase count to FINAL count of found arguments
    incr found_count
    # reset current string
    set found_string ""
  }

  #
  # pass back count always, even if no matches...
  #
  set local_array(count) $found_count

  #
  # NOTE: This should only return zero if there were 
  #       no characters in the string (because otherwise 
  #       we always found at least one element).
  #
  if {$found_count > "0"} then {
    # if we found anything, return non-zero
    set result "1"
  } else {
    # otherwise return zero
    set result "0"
  }

  return $result
}

This is really useful when you need to parse arguments contained in a string that is NOT a list. This routine could be improved in several ways. It could allow the caller to specify the number of elements to skip over, etc. More robust error handling could also be added.

1 comment

Ingemar Hansson 19 years, 2 months ago  # | flag

Enhancement. Hi!

Very useful routine indeed. But it contains one bug, it always returns 1 since the test if found_count is greater than 0 is placed after handling the leftover part of the string. This part increments found_count and it can never be 0.

But it is also a bit inefficient. I've written another proc which returns exactly the same result as the original one (but without the bug :).

Ingemar Hansson

Lund, Sweden

proc ParseString {variable_name string separators {maximum 0}} {
  # get ahold of an array to put results into
  upvar 1 $variable_name local_array

  # Split the string with the separators and count the elements
  set allsplit [split $string $separators]
  set allLen [llength $allsplit]

  # Assume that split characters existed in the string
  set result 1
  if {$allLen == 1} {
      # Nope, there were no split characters in the string
      set noOfElems 1
      set result 0
  } elseif {$maximum == 0 || $allLen &lt; $maximum} {
      # Use all list elements
      set noOfElems $allLen
  } else {
      # In case maximum is prepended with zeros or is a hex number, or ...
      # Of course some error handling could be added here
      set maximum [expr {$maximum}]
      set noOfElems $maximum
  }

  # Copy the list elements to the output array
  set ind 0
  while {[incr ind] &lt;= $noOfElems} {
      set local_array($ind) [lindex $allsplit [expr {$ind - 1}]]
  }

  # Fix last element of array if 'maximum' is greater than 0 and there
  # are more elements in splitted list than what 'maximum' says
  if {$maximum &gt; 0 &amp;&amp; $allLen &gt; $maximum} {
      # Join all characters in maximum-1 number of elements to a string
      set partsplit [join [lrange $allsplit 0 [expr {$maximum - 2}]] ""]
      # Calculate length of original string up to maximum-1 elements
      set used_len [expr {[string length $partsplit] + $maximum - 1}]
      # Copy rest of the original string into the last used array index
      set local_array($noOfElems) [string range $string $used_len end]
  }

  # pass back count always, even if no matches...
  set local_array(count) $noOfElems

  return $result
}
Created by Joe Mistachkin on Tue, 26 Feb 2002 (MIT)
Tcl recipes (162)
Joe Mistachkin's recipes (1)

Required Modules

  • (none specified)

Other Information and Tasks