Sometimes strings must be valid identifiers, for example when they represent symbolic names. Smalltalk offers the type 'Symbol' for this purpose. In Python, you need to test this explicitly. Here is a quick way.

Python, 12 lines
def AreStringsIdentifiers(*strings):
   # The interpreter validates __slots__ names; TypeError means a non-identifier.
   try:
      class test(object): __slots__ = strings
   except TypeError:
      return False
   else:
      return True

if __name__ == '__main__':
   print(AreStringsIdentifiers('A', 'B')) # -> True
   print(AreStringsIdentifiers('A', '1B', 'x y')) # -> False

This recipe takes advantage of the special treatment of the __slots__ attribute in new-style Python classes. The actual test is performed by the Python interpreter, which raises a TypeError if any slot name is not a valid Python identifier. Need I say more?
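As a rough illustration of what the interpreter checks, the sketch below restates the recipe for Python 3 and sets it beside `str.isidentifier()`. The names `slots_accept` and `is_usable_name` are hypothetical and not part of the recipe. One caveat, offered as an observation about CPython 3 rather than a guarantee: reserved words such as `class` pass the identifier test, so the stricter helper also consults `keyword.iskeyword()`.

```python
import keyword


def slots_accept(*strings):
    """Restate the recipe: let class creation validate the slot names."""
    try:
        class _Probe(object):
            __slots__ = strings
    except TypeError:
        return False
    return True


def is_usable_name(s):
    """Hypothetical stricter check: an identifier that is not a reserved word."""
    return isinstance(s, str) and s.isidentifier() and not keyword.iskeyword(s)


if __name__ == '__main__':
    print(slots_accept('A', 'B'))           # -> True
    print(slots_accept('A', '1B', 'x y'))   # -> False
    print(slots_accept('class'))            # a keyword still counts as an identifier
    print(is_usable_name('class'))          # -> False
```

Whether keywords should be rejected depends on how the symbolic names are used later; if they only ever serve as dictionary keys or `getattr` arguments, the looser test may be enough.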

Cheers and happy checking for identifiers!

4 comments

Alexander Semenov 18 years, 11 months ago

It isn't very fast. In Python, to say "it is fast" you must measure. In my experiments, this function:

import re

reident = re.compile(r'^[a-zA-Z_]\w*$')
def AreStringsIdentifiers2(*strings):
    matcher = reident.match
    for s in strings:
        if matcher(s) is None: return False
    return True

is five times faster.
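A note on the pattern above, as a sketch and not a correction of the measurement: in Python's `re` module, `$` also matches just before a trailing newline, so `'A\n'` is accepted, and on Python 3 `\w` matches non-ASCII word characters unless `re.ASCII` is given. The variant below is hypothetical (the names `strict` and `are_strings_identifiers_strict` are not from the thread) and anchors with `\Z` instead.

```python
import re

# Pattern from the comment above.
loose = re.compile(r'^[a-zA-Z_]\w*$')

# Hypothetical stricter variant: \Z anchors at the true end of the string,
# and re.ASCII keeps \w limited to [a-zA-Z0-9_] on Python 3.
strict = re.compile(r'[a-zA-Z_]\w*\Z', re.ASCII)


def are_strings_identifiers_strict(*strings):
    """Return True only if every string matches the strict ASCII pattern."""
    matcher = strict.match
    return all(matcher(s) is not None for s in strings)


if __name__ == '__main__':
    print(loose.match('A\n') is not None)   # -> True: trailing newline slips through
    print(strict.match('A\n') is not None)  # -> False
```

The stricter pattern accepts only ASCII names, which is narrower than what Python 3 itself allows as an identifier.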

Paul Miller 18 years, 10 months ago

Does it need to be fast? I can't see this function being a bottleneck in any reasonable program.

Zoran Isailovski (author) 18 years, 10 months ago

Speed's not crucial here. Speed is not of such great importance here, as the previous commenter noticed. I'd better remove the misleading part from the recipe.

Zoran Isailovski (author) 18 years, 10 months ago

About credibility of measurements. "Don't take it for granted, do your own measurements" someone said. So I did:

  • When the very first string is not an identifier, the re-based code above is about 3.3 (not 5) times faster, and this factor does not seem to vary with the string length.

  • When the very last string is not an identifier, the re-based code is about 3.3 times SLOWER for 8-char-strings, getting even worse for shorter strings.

Usually, measurements are expected to increase the accuracy and credibility of a statement. But then, it depends on what and how you measure ...
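For anyone who wants to repeat the comparison, here is a minimal `timeit` sketch of the two cases described above. It is not the setup used in the thread: the number of strings (50), the invalid name `'1bad'`, the repeat count, and the helper names `via_slots`, `via_regex`, and `compare` are all assumptions, and the ratios will vary with the Python version and machine.

```python
import re
import timeit


def via_slots(*strings):
    """The recipe's approach: class creation validates the slot names."""
    try:
        class _Probe(object):
            __slots__ = strings
    except TypeError:
        return False
    return True


_ident = re.compile(r'^[a-zA-Z_]\w*$')


def via_regex(*strings):
    """The regex approach from the first comment."""
    matcher = _ident.match
    for s in strings:
        if matcher(s) is None:
            return False
    return True


def compare(strings, number=2000):
    """Return (slots_seconds, regex_seconds) for `number` calls of each."""
    t_slots = timeit.timeit(lambda: via_slots(*strings), number=number)
    t_regex = timeit.timeit(lambda: via_regex(*strings), number=number)
    return t_slots, t_regex


if __name__ == '__main__':
    good = ['name%04d' % i for i in range(50)]  # 8-character identifiers
    cases = [
        ('first invalid', ['1bad'] + good),
        ('last invalid', good + ['1bad']),
    ]
    for label, strings in cases:
        t_slots, t_regex = compare(strings)
        print('%-14s slots=%.4fs  regex=%.4fs' % (label, t_slots, t_regex))
```

Measuring both orderings matters because the regex loop stops at the first bad string, while the `__slots__` check is done inside the interpreter in one pass.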

Created by Zoran Isailovski on Fri, 6 May 2005 (PSF)
