Lets you load functions or classes from modules inside packages without hard-coding their names in your code; instead, the names can be specified in a configuration file, as command-line parameters, or through a user interface.
Python, 65 lines
When you design a toolkit or framework for other developers, you often need to provide them with a means of using custom classes. For example, if you design an application to only use MySQL as a back end, you will quickly find users of your framework asking for a PostgreSQL version. By allowing developers to specify which database they prefer, you reach a wider audience.
In many open-source applications, you may even leave the possibility open for others to extend your work, by creating custom classes of their own which implement your interface. In our database example, some enterprising developer might wish to take your framework and write an Oracle version. However, managing multiple builds for each database becomes unwieldy.
Ideally, whether we develop new classes ourselves or expect others to develop them, we would like to decide at deployment time which database we will use. If we provide a generic StorageManager class, we can then write specific subclasses for each database we support. We might at first provide a StorageManagerMySQL by default, and leave other, similar development for the future. If we have a single text configuration file (a la ConfigParser), for example, we can specify at installation time that we want to use PostgreSQL instead of MySQL with a setting like:
[StorageManager]
Class: framework.databases.my_custom_stuff.StorageManagerPostgreSQL
This gives the users of our framework greater flexibility in deploying our code, which often leads to more widespread use.
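As a rough illustration of the deployment-time setting above, the following sketch reads the class path with the standard library's configparser module (the Python 3 name for ConfigParser). The configuration text is inlined here only to keep the example self-contained; in practice it would come from a file, and the variable names are assumptions made for illustration.

```python
import configparser

# Hypothetical configuration text; a real deployment would read it from a file
# with parser.read("some_file.ini") instead.
CONFIG_TEXT = """
[StorageManager]
Class: framework.databases.my_custom_stuff.StorageManagerPostgreSQL
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Option names are case-insensitive by default, so "Class" matches "class".
class_path = parser.get("StorageManager", "Class")
print(class_path)
```

The resulting string is all the loader needs; nothing about PostgreSQL is compiled into the framework itself.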
If the desired class is known when you write the code, importing is trivial: from package.module import function. If the name of the class or function is known only at run time, and it is buried in a package hierarchy that you didn't create, the problem becomes more difficult.
_get_func() begins by parsing a full package name into its component parts. Whenever we specify a dotted package name, such as "framework.databases.my_custom_stuff.StorageManagerPostgreSQL", the word after the last dot will be the function name (or class name), and the word before that will be the filename of the module in which we find said function. The task, then, is to find and load that module file.
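The parsing step described above might look something like the sketch below. The helper name split_dotted_name is hypothetical and is not taken from the recipe's listing; it only shows one way to separate the module path from the final attribute name.

```python
def split_dotted_name(full_name):
    """Split 'pkg.module.attr' into ('pkg.module', 'attr')."""
    # rpartition splits on the LAST dot, which is exactly the boundary we want.
    module_name, dot, attr_name = full_name.rpartition(".")
    if not dot or not module_name or not attr_name:
        raise ValueError(
            "expected a dotted name like 'pkg.module.attr', got %r" % (full_name,)
        )
    return module_name, attr_name


module_name, attr_name = split_dotted_name(
    "framework.databases.my_custom_stuff.StorageManagerPostgreSQL"
)
print(module_name)
print(attr_name)
```

Rejecting names without a dot up front gives a clearer error than letting the import machinery fail later.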
After a quick check (in sys.modules) to see if the module has already been loaded, we use the built-in function __import__() to load the module we desire. You may notice something odd about the call to __import__(): why is the last argument a list whose only member is an empty string? This hack stems from a quirk of __import__(): if that argument (the "fromlist") is empty, asking for module "A.B.C" imports the whole chain but returns only the top-level package "A". If the fromlist is non-empty, whatever it contains, we get back "A.B.C" itself. Once we have "A.B.C", we use getattr() to reference the function or class "D" within the module.
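The fromlist quirk is easy to see with a module from the standard library. The sketch below uses xml.dom.minidom purely as a stand-in for "A.B.C". It also shows importlib.import_module(), which in current Python versions returns the leaf module directly and can be used instead of the fromlist trick.

```python
import importlib
import sys

# Empty fromlist: xml.dom.minidom is imported, but the top-level package
# "xml" is what comes back.
top = __import__("xml.dom.minidom")
print(top.__name__)   # xml

# Non-empty fromlist: the module named by the full dotted path comes back.
leaf = __import__("xml.dom.minidom", globals(), locals(), [""])
print(leaf.__name__)  # xml.dom.minidom

# importlib.import_module() returns the same leaf module without the hack.
same = importlib.import_module("xml.dom.minidom")
assert same is leaf is sys.modules["xml.dom.minidom"]

# With the module in hand, getattr() fetches the function or class by name.
parse_string = getattr(leaf, "parseString")
print(parse_string("<a/>").documentElement.tagName)  # a
```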
If you want your custom module to be located using this technique, you must have its base path included in sys.path. For example, if you want to find "framework.databases.my_custom_stuff.StorageManagerPostgreSQL", and that module exists as "/var/lib/app/framework/databases/my_custom_stuff.py", your sys.path should include "/var/lib/app/". Windows users, invert slashes accordingly.
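To make the sys.path requirement concrete, here is a small self-contained sketch that builds a throwaway package tree in a temporary directory, adds its base directory to sys.path, and then loads a class from it. The package name demo_framework is made up for this example, and os.path.join is used so the same code works with either kind of slash.

```python
import importlib
import os
import sys
import tempfile

# Build <base>/demo_framework/databases/my_custom_stuff.py (hypothetical names).
base = tempfile.mkdtemp()
top_dir = os.path.join(base, "demo_framework")
pkg_dir = os.path.join(top_dir, "databases")
os.makedirs(pkg_dir)
for directory in (top_dir, pkg_dir):
    # Empty __init__.py files mark the directories as regular packages.
    open(os.path.join(directory, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "my_custom_stuff.py"), "w") as handle:
    handle.write("class StorageManagerPostgreSQL:\n    pass\n")

# The BASE directory goes on sys.path, not the package directory itself.
sys.path.insert(0, base)
importlib.invalidate_caches()

module = importlib.import_module("demo_framework.databases.my_custom_stuff")
cls = getattr(module, "StorageManagerPostgreSQL")
print(cls.__module__)
```

If base were missing from sys.path, the import_module() call would raise ModuleNotFoundError for demo_framework.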
_get_class() does the same work as _get_func(), but adds the ability to verify that the retrieved class is a subclass of another, base class.
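The sketch below shows one way the two steps could fit together. It is not the recipe's own listing: the names load_attr and load_class are hypothetical, and it uses importlib.import_module() rather than the __import__() call described above.

```python
import collections
import importlib
import inspect


def load_attr(full_name):
    """Return the object named by a dotted path such as 'pkg.module.attr'."""
    module_name, _, attr_name = full_name.rpartition(".")
    if not module_name or not attr_name:
        raise ValueError("expected 'pkg.module.attr', got %r" % (full_name,))
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def load_class(full_name, parent_class=None):
    """Like load_attr, but insist on a class, optionally a subclass of parent_class."""
    cls = load_attr(full_name)
    if not inspect.isclass(cls):
        raise TypeError("%s is not a class" % full_name)
    if parent_class is not None and not issubclass(cls, parent_class):
        raise TypeError(
            "%s is not a subclass of %s" % (full_name, parent_class.__name__)
        )
    return cls


# Usage: OrderedDict is a dict subclass, so this succeeds.
ordered = load_class("collections.OrderedDict", dict)
assert ordered is collections.OrderedDict

# A mismatched base class is rejected before the class is ever instantiated.
try:
    load_class("collections.OrderedDict", list)
except TypeError as exc:
    print("rejected:", exc)
```

Checking the base class at load time means a misconfigured class name fails immediately, with a clear message, rather than later when a missing method is called.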
If you decide to wrap these in their own module, you may experience issues with importing the same module more than once. The items in sys.modules do not always reflect the full package associated with those modules. I recommend the narrowest import method:
from package.path.to.classloader import _get_func, _get_class
Credit and Motivation:
I can only claim credit for assembling the parts here; all of the difficult bits I found in the Python documentation or in various places on the 'Net. You can read more about __import__() in the current Library Reference: http://www.python.org/doc/current/lib/built-in-funcs.html
If I'm stepping on anyone else's work, please let me know. I merely post it due to the large number of people looking for this technique here and only finding code to import without regard to packages, e.g.: http://dbforums.com/arch/97/2002/5/367588
The first versions I submitted of this recipe were much uglier. Thanks to Peter Otten on comp.lang.python for setting me straight (twice!): http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=bk5jld%24jkc%2400%241%40news.t-online.com