A portable class to carry out all sorts of validation on strings. It uses regular expressions to carry out common validation procedures.
import re


class StringValidator:
    # Compiled patterns are cached on the class and built on first use.
    RE_ALPHA = None
    RE_ALPHANUMERIC = None
    RE_NUMERIC = None
    RE_EMAIL = None
    validateString = ""
    # Named custom patterns, shared by all instances.
    _patterns = {}

    def __init__(self, validateString):
        self.validateString = validateString

    def isAlpha(self):
        if not self.__class__.RE_ALPHA:
            # Letters only; "\D" would also accept punctuation and whitespace.
            self.__class__.RE_ALPHA = re.compile(r"^[a-zA-Z]+$")
        return self.checkStringAgainstRe(self.__class__.RE_ALPHA)

    def isAlphaNumeric(self):
        if not self.__class__.RE_ALPHANUMERIC:
            self.__class__.RE_ALPHANUMERIC = re.compile(r"^[a-zA-Z0-9]+$")
        return self.checkStringAgainstRe(self.__class__.RE_ALPHANUMERIC)

    def isNumeric(self):
        if not self.__class__.RE_NUMERIC:
            self.__class__.RE_NUMERIC = re.compile(r"^\d+$")
        return self.checkStringAgainstRe(self.__class__.RE_NUMERIC)

    def isEmail(self):
        if not self.__class__.RE_EMAIL:
            self.__class__.RE_EMAIL = re.compile(r"^.+@.+\..{2,3}$")
        return self.checkStringAgainstRe(self.__class__.RE_EMAIL)

    def isEmpty(self):
        return self.validateString == ""

    def definePattern(self, re_name, re_pat):
        self._patterns[re_name] = re_pat

    def isValidForPattern(self, re_name):
        if re_name in self._patterns:
            # Patterns stored as strings are compiled on first use and cached.
            if isinstance(self._patterns[re_name], str):
                self._patterns[re_name] = re.compile(self._patterns[re_name])
            return self.checkStringAgainstRe(self._patterns[re_name])
        else:
            raise KeyError("No pattern named '%s' stored." % re_name)

    # this method should be considered private (not to be used via the interface)
    def checkStringAgainstRe(self, regexObject):
        return regexObject.search(self.validateString) is not None
# example usage
sv1 = StringValidator("joe@testmail.com")
sv2 = StringValidator("rw__343")

if sv1.isEmail():
    print(sv1.validateString + " is a valid e-mail address")
else:
    print(sv1.validateString + " is not a valid e-mail address")

if sv2.isAlphaNumeric():
    print(sv2.validateString + " is a valid alpha-numeric string")
else:
    print(sv2.validateString + " is not a valid alpha-numeric string")

# note: this is basically the same as the e-mail checker; it just shows
# how to define a custom re
sv2.definePattern("custom_email", r"^.+@.+\..{2,3}$")
if sv1.isValidForPattern("custom_email"):
    print(sv1.validateString + " is a valid e-mail address")
else:
    print(sv1.validateString + " is an invalid e-mail address")
Programmers often put a validation engine inside each individual component they make (e.g. Form, Emailer, etc.). That can cause maintainability and consistency problems. A much better approach is to have a general validation object to which the task of performing validation can be delegated. That is exactly what this class attempts to provide.
Why compile Rx's each time? Why not implement lazy caching of the actual objects produced by compile() and keep them in the validator instance?
thanks. Thanks for the suggestion. That makes a lot of sense. It's definitely more efficient. I have made the change to the code; let me know if that's what you were thinking of.
actually.. Actually, the other way would be more efficient if I knew I was only going to call the "is" methods once or twice, because it would compile only the Rx's it needs. However, assuming that we will call these "is" methods many times, the new way is more efficient because the Rx's won't be compiled every time.
Either way, I think the new way is better design.
Lazy Rx compilations. So we could combine the ideas and reuse compiled regular expressions, but only compile them as they're needed. Each time a validity-testing method is invoked, just check whether the pattern has already been compiled: if so, use it; if not, compile it and save it. I would also add support for arbitrary patterns stored by name that could be added at the class level. One last note: if you want to make the internal service method private, prepend "__" to its name.
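As a hedged illustration of the privacy note above: a method name that starts with two underscores (and does not end with two) is name-mangled by Python, so it cannot be reached under its plain name from outside the class. The class `MiniValidator` and its method names are hypothetical stand-ins for this sketch, not the recipe's code.

```python
import re


class MiniValidator:
    """Hypothetical cut-down validator used only to show name mangling."""

    _RE_NUMERIC = re.compile(r"^\d+$")

    def __init__(self, value):
        self.value = value

    def isNumeric(self):
        # Inside the class body, __check resolves to _MiniValidator__check.
        return self.__check(self._RE_NUMERIC)

    def __check(self, regex):
        # Two leading underscores trigger name mangling.
        return regex.search(self.value) is not None


v = MiniValidator("12345")
print(v.isNumeric())                        # the public method works as usual
print(hasattr(v, "__check"))                # the plain name is not visible
print(hasattr(v, "_MiniValidator__check"))  # the mangled name still exists
```

Mangling discourages accidental use rather than enforcing privacy; a single leading underscore is the lighter-weight convention for the same intent.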
Ok.. I implemented lazy compilations. Please review my code. Is there any way that I can directly detect whether a variable is a compiled re object? I tried "isinstance".. but it didn't seem to work.
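One way to detect a compiled pattern, sketched here under the assumption that comparing against the type of a known compiled object is acceptable, is to take the type of a throwaway `re.compile("")` result and use it with `isinstance`. The names `COMPILED_RE_TYPE` and `is_compiled_re` are made up for this example.

```python
import re

# The concrete type of compiled patterns, obtained without naming it directly.
COMPILED_RE_TYPE = type(re.compile(""))


def is_compiled_re(obj):
    """Return True if obj is a compiled regular expression object."""
    return isinstance(obj, COMPILED_RE_TYPE)


print(is_compiled_re(re.compile(r"^\d+$")))  # a compiled pattern
print(is_compiled_re(r"^\d+$"))              # a plain string
```

The inverse test (checking whether the stored value is still a string) is usually simpler and is the approach the recipe's isValidForPattern takes.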
In other words, are you saying.. you would like to see the ability to add arbitrary patterns at run time?
Getting closer. Just about there... but I would set those class-level variables to None and then just test the variables like:
Note that the return statement should NOT be under the else clause as you had it before! With the return under the else, the case where no pre-compiled Rx is found in the cache variable performs the compilation but skips the return statement, so the calling expression receives the default return value of None.
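A minimal sketch of the shape described here, assuming a class-level cache slot initialised to None and a return placed after the if block rather than inside an else. `LazyValidator` is a hypothetical cut-down class, not the recipe itself.

```python
import re


class LazyValidator:
    """Hypothetical cut-down validator showing the lazy-compilation shape."""

    RE_NUMERIC = None  # cache slot, filled on first use

    def __init__(self, value):
        self.value = value

    def isNumeric(self):
        if LazyValidator.RE_NUMERIC is None:
            # Compile once and keep the result for later calls.
            LazyValidator.RE_NUMERIC = re.compile(r"^\d+$")
        # The return sits outside the if, so it also runs on the first call.
        return LazyValidator.RE_NUMERIC.search(self.value) is not None


print(LazyValidator("42").isNumeric())
print(LazyValidator("4x").isNumeric())
```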
The idea I threw out about arbitrary patterns was just storing custom named patterns and lazy-compiling those as well. The user would then have to employ a method with the name as an argument or something similar:
isValidForPattern. You can't call a method of an uninstantiated class without an instance.
So here's a module function that stores the pattern in a class variable:
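A hedged sketch of such a module-level function, assuming the patterns live in a class-level dictionary named `_patterns`. `PatternValidator` is a hypothetical cut-down class and the "zip5" pattern is an invented example; `storePattern` is the name used later in this thread.

```python
import re


class PatternValidator:
    """Hypothetical cut-down validator with a shared, class-level pattern store."""

    _patterns = {}

    def __init__(self, value):
        self.value = value

    def isValidForPattern(self, name):
        if name not in self._patterns:
            raise KeyError("No pattern named '%s' stored." % name)
        if isinstance(self._patterns[name], str):
            # Compile lazily and cache the compiled object under the same name.
            self._patterns[name] = re.compile(self._patterns[name])
        return self._patterns[name].search(self.value) is not None


def storePattern(name, pattern):
    """Module-level helper: register a pattern without needing an instance."""
    PatternValidator._patterns[name] = pattern


storePattern("zip5", r"^\d{5}$")
print(PatternValidator("90210").isValidForPattern("zip5"))
```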
Here's the testing code (BTW: all testing code should be placed in an "if __name__ == '__main__':" construct):
BTW: the method of testing the RE if they are of type string allows defining them at the start of the class definition for more clarity
isValidForPattern revisited. Of course you have to add a class variable '_patterns = {}' to the StringValidator class. Or you alter the storePattern method thus:
ok. Ok. I have implemented all recommended changes, please review the code.
Short email addresses. It is important to note that email addresses _can_ be shorter than 7 characters. Although domain names shorter than 2 letters are not allowed in .com, .net and .org, many ISO country domain registrars do allow one-letter domain names, for example in Denmark, where I live. See http://www.n.dk
So a@n.dk is in fact a potential valid email address, which would not get through your regexp :-)
a slight enhancement. better to bind, e.g., self.__class__.RE_ALPHA=etc, rather than self.RE_ALPHA, so all other instances of the same class will transparently re-use the compiled re (keep per-class, not per-instance).
! Great idea! :]
RE_EMAIL regex. The regex for isEmail will return true even if the string being validated contains space characters. The following regex allows for any character (just as the original) except a space:
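As a hedged sketch of a pattern with that property (not necessarily the commenter's own), each `.` wildcard in the original can be replaced by `[^ ]`, which matches any character except a space. The names `RE_EMAIL_ORIGINAL` and `RE_EMAIL_NO_SPACE` are made up for this comparison.

```python
import re

# The recipe's pattern, and a variant that refuses spaces anywhere.
RE_EMAIL_ORIGINAL = re.compile(r"^.+@.+\..{2,3}$")
RE_EMAIL_NO_SPACE = re.compile(r"^[^ ]+@[^ ]+\.[^ ]{2,3}$")

for candidate in ("joe@testmail.com", "joe smith@testmail.com"):
    print(
        candidate,
        bool(RE_EMAIL_ORIGINAL.search(candidate)),
        bool(RE_EMAIL_NO_SPACE.search(candidate)),
    )
```

Both patterns remain loose plausibility checks; neither rejects other whitespace such as tabs, and both cap the final label at three characters.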
Classmethod. Why not make definePattern a classmethod i.e.
definePattern = classmethod(definePattern)
then you define patterns without requiring an instance i.e.
StringValidator.definePattern(....)
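A minimal sketch of this suggestion, using a hypothetical cut-down class `SharedPatternValidator` that keeps only the pattern store; the "hex" pattern is an invented example.

```python
import re


class SharedPatternValidator:
    """Hypothetical cut-down validator; only the pattern store is shown."""

    _patterns = {}

    def __init__(self, validateString):
        self.validateString = validateString

    def definePattern(cls, re_name, re_pat):
        # cls is the class itself, so no instance is needed to register a pattern.
        cls._patterns[re_name] = re_pat
    definePattern = classmethod(definePattern)

    def isValidForPattern(self, re_name):
        if re_name not in self._patterns:
            raise KeyError("No pattern named '%s' stored." % re_name)
        if isinstance(self._patterns[re_name], str):
            # Compile on first use and cache the compiled object.
            self._patterns[re_name] = re.compile(self._patterns[re_name])
        return self._patterns[re_name].search(self.validateString) is not None


SharedPatternValidator.definePattern("hex", r"^[0-9a-fA-F]+$")
print(SharedPatternValidator("c0ffee").isValidForPattern("hex"))
```

Writing `@classmethod` above the method definition is the equivalent decorator spelling of the `definePattern = classmethod(definePattern)` line.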