Lets you load classes and functions from modules inside packages without hard-coding their names in code; instead, the names can be specified in a configuration file, as command-line parameters, or within an interface.

Python, 65 lines
###################
##                #
## classloader.py #
##                #
###################

import sys, types

def _get_mod(modulePath):
    try:
        aMod = sys.modules[modulePath]
        if not isinstance(aMod, types.ModuleType):
            raise KeyError
    except KeyError:
        # The last [''] is very important!
        aMod = __import__(modulePath, globals(), locals(), [''])
        sys.modules[modulePath] = aMod
    return aMod

def _get_func(fullFuncName):
    """Retrieve a function object from a full dotted-package name."""
    
    # Parse out the path, module, and function
    lastDot = fullFuncName.rfind(u".")
    funcName = fullFuncName[lastDot + 1:]
    modPath = fullFuncName[:lastDot]
    
    aMod = _get_mod(modPath)
    aFunc = getattr(aMod, funcName)
    
    # Assert that the function is a *callable* attribute.
    assert callable(aFunc), u"%s is not callable." % fullFuncName
    
    # Return a reference to the function itself,
    # not the results of the function.
    return aFunc

def _get_class(fullClassName, parentClass=None):
    """Load a module and retrieve a class (NOT an instance).
    
    If parentClass is supplied, the class must be parentClass itself
    or a subclass of parentClass (otherwise TypeError is raised).
    """
    aClass = _get_func(fullClassName)
    
    # Assert that the class is a subclass of parentClass.
    if parentClass is not None:
        if not issubclass(aClass, parentClass):
            raise TypeError(u"%s is not a subclass of %s" %
                            (fullClassName, parentClass))
    
    # Return a reference to the class itself, not an instantiated object.
    return aClass


######################
##       Usage      ##
######################

class StorageManager: pass
class StorageManagerMySQL(StorageManager): pass

def storage_object(aFullClassName, allOptions={}):
    aStoreClass = _get_class(aFullClassName, StorageManager)
    return aStoreClass(allOptions)

Usage:

When you design a toolkit or framework for other developers, you often need to provide them with a means of using custom classes. For example, if you design an application to only use MySQL as a back end, you will quickly find users of your framework asking for a PostgreSQL version. By allowing developers to specify which database they prefer, you reach a wider audience.

In many open-source applications, you may even leave the possibility open for others to extend your work, by creating custom classes of their own which implement your interface. In our database example, some enterprising developer might wish to take your framework and write an Oracle version. However, managing multiple builds for each database becomes unwieldy.

Ideally, whether we develop new classes ourselves or expect others to develop them, we would like to decide at deployment time which database we will use. If we provide a generic StorageManager class, we can then write specific subclasses for each database we support. We might at first provide a StorageManagerMySQL by default, and leave other, similar development for the future. If we have a single text configuration file (a la ConfigParser), for example, we can specify at installation time that we want to use PostgreSQL instead of MySQL with a setting like:

[StorageManager]
Class: framework.databases.my_custom_stuff.StorageManagerPostgreSQL

This gives the users of our framework greater flexibility in deploying our code, which often leads to more widespread use.
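
As a rough illustration of the deployment-time choice described above, the following sketch reads a setting like the one shown and resolves it to a class. It is only a sketch under stated assumptions: it targets Python 3 (configparser, importlib.import_module) rather than the recipe's own helpers, the function name class_from_config is hypothetical, and a standard-library class stands in for StorageManagerPostgreSQL so the example can run anywhere.

```python
import configparser
import importlib

# Hypothetical configuration text; collections.OrderedDict stands in for the
# framework's StorageManagerPostgreSQL so that the sketch can run anywhere.
CONFIG_TEXT = """
[StorageManager]
Class: collections.OrderedDict
"""

def class_from_config(text, section="StorageManager", option="Class"):
    """Return the class named by a dotted path stored in the configuration."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    dotted_name = parser.get(section, option)
    # Everything before the last dot is the module, the rest is the class.
    module_path, _, class_name = dotted_name.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

storage_class = class_from_config(CONFIG_TEXT)
print(storage_class.__name__)
```

In a real deployment the text would come from a file (for example via parser.read()), and the resolved class would normally be checked against StorageManager before being instantiated.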

Analysis:

If the desired class is known when you write the code, importing is trivial: from package.module import function. If the class or function you want to import is only named at run time, or is buried in a package hierarchy that you didn't create, the problem becomes more difficult.

_get_func() begins by parsing a full package name into its component parts. Whenever we specify a dotted package name, such as "framework.databases.my_custom_stuff.StorageManagerPostgreSQL", the word after the last dot will be the function name (or class name), and the word before that will be the filename of the module in which we find said function. The task, then, is to find and load that module file.
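
The parsing step can be tried on its own. The sketch below uses str.rpartition instead of the recipe's rfind-and-slice approach; split_dotted_name is a hypothetical helper name, not part of the recipe. One reason to prefer it: when the name contains no dot, rfind returns -1 and the slice fullFuncName[:-1] silently drops the last character, whereas rpartition returns an empty module path that is easy to test for.

```python
def split_dotted_name(full_name):
    """Split 'pkg.mod.attr' into ('pkg.mod', 'attr')."""
    module_path, _, attr_name = full_name.rpartition(".")
    return module_path, attr_name

print(split_dotted_name(
    "framework.databases.my_custom_stuff.StorageManagerPostgreSQL"))
# ('framework.databases.my_custom_stuff', 'StorageManagerPostgreSQL')

print(split_dotted_name("StorageManager"))
# ('', 'StorageManager')

# The slicing pitfall mentioned above, for comparison:
name = "StorageManager"
print(name[:name.rfind(".")])
# StorageManage
```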

After a quick check (in sys.modules) to see if the module has already been loaded, we use the built-in function __import__() to load the module we desire. You may notice something odd about the call to __import__(): why is the last parameter a list whose only member is an empty string? This hack stems from a quirk of __import__(): if the last parameter (the "fromlist") is empty, importing "A.B.C" returns only the top-level package "A". If the last parameter is non-empty, regardless of what its value is, we get back "A.B.C" itself. Once we have "A.B.C", we use getattr() to reference the function (or class) "D" within the module.
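
The quirk is easy to observe with a module from the standard library. The sketch below uses email.mime.text purely as a convenient dotted name; it also shows importlib.import_module, which the Python documentation recommends over calling __import__() directly and which returns the innermost module without any fromlist trick.

```python
import importlib

# Empty fromlist: the top-level package comes back.
top = __import__("email.mime.text")

# Non-empty fromlist (its contents do not matter here): the leaf module comes back.
leaf = __import__("email.mime.text", globals(), locals(), [""])

# importlib.import_module always returns the leaf module.
same_leaf = importlib.import_module("email.mime.text")

print(top.__name__)       # email
print(leaf.__name__)      # email.mime.text
print(leaf is same_leaf)  # True
```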

If you want your custom module to be located using this technique, you must have its base path included in sys.path. For example, if you want to find "framework.databases.my_custom_stuff.StorageManagerPostgreSQL", and that module exists as "/var/lib/app/framework/databases/my_custom_stuff.py", your sys.path should include "/var/lib/app/". Windows users, invert slashes accordingly.
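
The sys.path requirement can be checked with a throwaway package. This is only a sketch, assuming Python 3: the package name demo_framework is hypothetical (chosen to avoid clashing with real modules), and a temporary directory plays the role of "/var/lib/app/".

```python
import importlib
import os
import sys
import tempfile

# Build <base_dir>/demo_framework/databases/my_custom_stuff.py on disk.
base_dir = tempfile.mkdtemp()
package_dir = os.path.join(base_dir, "demo_framework", "databases")
os.makedirs(package_dir)

# Mark both directories as packages.
for directory in (os.path.join(base_dir, "demo_framework"), package_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()

with open(os.path.join(package_dir, "my_custom_stuff.py"), "w") as handle:
    handle.write("class StorageManagerPostgreSQL(object):\n    pass\n")

# The directory that *contains* the top-level package goes on sys.path.
sys.path.insert(0, base_dir)
importlib.invalidate_caches()

module = importlib.import_module("demo_framework.databases.my_custom_stuff")
loaded_class = getattr(module, "StorageManagerPostgreSQL")
print(loaded_class.__module__)
```

Using os.path.join to build the paths sidesteps the slash direction question on Windows.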

_get_class() does the same work as _get_func(), but adds the ability to verify that the retrieved class is a subclass of another, base class.
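
The subclass check can be sketched in a few lines. This is not the recipe's code but a Python 3 approximation of the same idea; load_class is a hypothetical name, and standard-library classes stand in for StorageManager and its subclasses. As in _get_class(), a failed check raises TypeError.

```python
import collections
import importlib

def load_class(full_class_name, parent_class=None):
    """Return the class named by a dotted path, optionally checking its base."""
    module_path, _, class_name = full_class_name.rpartition(".")
    loaded = getattr(importlib.import_module(module_path), class_name)
    if not isinstance(loaded, type):
        raise TypeError("%s is not a class" % full_class_name)
    if parent_class is not None and not issubclass(loaded, parent_class):
        raise TypeError("%s is not a subclass of %s"
                        % (full_class_name, parent_class))
    return loaded

# OrderedDict derives from dict, so this succeeds.
ordered = load_class("collections.OrderedDict", dict)
print(ordered.__name__)

# It does not derive from list, so this raises TypeError.
try:
    load_class("collections.OrderedDict", list)
except TypeError as error:
    print(error)
```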

Known issues:

If you decide to wrap these in their own module, you may experience issues with importing the same module more than once. The items in sys.modules do not always reflect the full package associated with those modules. I recommend the narrowest import method:

from package.path.to.classloader import _get_func, _get_class
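
One way the double-import problem can show up, sketched here under assumptions: if the same file is reachable under two different names (because both a package's parent directory and the package directory itself are on sys.path), Python creates two separate module objects, and classes defined in them no longer pass issubclass() checks against each other. The names demo_pkg_twice and demo_storage_mod are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Build <base_dir>/demo_pkg_twice/demo_storage_mod.py on disk.
base_dir = tempfile.mkdtemp()
package_dir = os.path.join(base_dir, "demo_pkg_twice")
os.makedirs(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "demo_storage_mod.py"), "w") as handle:
    handle.write("class StorageManager(object):\n    pass\n")

# Both the parent directory and the package directory are import roots,
# so the same file can be imported under two different names.
sys.path[:0] = [base_dir, package_dir]
importlib.invalidate_caches()

qualified = importlib.import_module("demo_pkg_twice.demo_storage_mod")
bare = importlib.import_module("demo_storage_mod")

print(qualified is bare)                                          # False
print(issubclass(bare.StorageManager, qualified.StorageManager))  # False
```

Importing through one consistent, fully qualified path avoids ending up with two copies.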

Credit and Motivation:

I can only claim credit for the assembly of the parts, here; all of the difficult bits I found in the Python documentation, or various places on the 'Net. You can read more about __import__() in the current Library Reference: http://www.python.org/doc/current/lib/built-in-funcs.html

If I'm stepping on anyone else's work, please let me know. I merely post it due to the large number of people looking for this technique here and only finding code to import without regard to packages, e.g.: http://dbforums.com/arch/97/2002/5/367588

The first versions I submitted of this recipe were much uglier. Thanks to Peter Otten on comp.lang.python for setting me straight (twice!): http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=bk5jld%24jkc%2400%241%40news.t-online.com

4 comments

Robert Brewer (author) 20 years, 7 months ago

Oops. You don't need to import the imp module, just sys. A previous version of this code used imp.

Denis Otkidach 20 years ago

You don't need to look in sys.modules and store the imported module there. The built-in __import__ already does this. So, _get_mod can be simplified:

def _get_mod(modulePath):
    return __import__(modulePath, globals(), locals(), ['*'])

Brian Lee 17 years, 11 months ago

Getting a new instance. This may have been obvious to some people, but it wasn't to me. The following is an example of how to get a new instance:

import new

tmpClass = _get_class("command.TestCommand")

testCommand = new.instance(tmpClass)

http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html?page=1

Cristian Consonni 13 years, 5 months ago

I fixed a couple of bugs in the above code:

  • added an if/else statement to storage_object, since class constructors that take no options were raising an error
  • if the class is defined in the main script (its name contains no dot), the module used for the import is "__main__"
  • added an example of using storage_object()

#! /usr/bin/python
# -*- coding: UTF-8 -*-

## adapted from:
## {{{ http://code.activestate.com/recipes/223972/ (r4)
###################
##                #
## classloader.py #
##                #
###################

import sys, types

def _get_mod(modulePath):
    return __import__(modulePath, globals(), locals(), ['*'])

def _get_func(fullFuncName):
    """Retrieve a function object from a full dotted-package name."""

    # Parse out the path, module, and function
    lastDot = fullFuncName.rfind(u".")
    funcName = fullFuncName[lastDot + 1:]
    if lastDot == -1:
        # No dot in the name: look it up in the running script.
        modPath = '__main__'
    else:
        modPath = fullFuncName[:lastDot]

    aMod = _get_mod(modPath)
    aFunc = getattr(aMod, funcName)

    # Assert that the function is a *callable* attribute.
    assert callable(aFunc), u"%s is not callable." % fullFuncName

    # Return a reference to the function itself,
    # not the results of the function.
    return aFunc

def _get_class(fullClassName, parentClass=None):
    """Load a module and retrieve a class (NOT an instance).

    If parentClass is supplied, the class must be parentClass itself
    or a subclass of parentClass (otherwise TypeError is raised).
    """
    aClass = _get_func(fullClassName)

    # Assert that the class is a subclass of parentClass.
    if parentClass is not None:
        if not issubclass(aClass, parentClass):
            raise TypeError(u"%s is not a subclass of %s" %
                            (fullClassName, parentClass))

    # Return a reference to the class itself, not an instantiated object.
    return aClass


######################
##       Usage      ##
######################
if __name__ == '__main__':
    class StorageManager: pass
    class StorageManagerMySQL(StorageManager): pass

    def storage_object(aFullClassName, allOptions={}):
        aStoreClass = _get_class(aFullClassName, StorageManager)
        # Only pass options to the constructor when some were given.
        if allOptions == {}:
            return aStoreClass()
        else:
            return aStoreClass(allOptions)

    sto = storage_object('StorageManager')
    print(sto)
## end of http://code.activestate.com/recipes/223972/ }}}

here's my output:

<__main__.StorageManager instance at 0x7fc89458d8c0>

as expected.

Created by Robert Brewer on Sun, 21 Sep 2003 (PSF)