The get_thread_storage() function described below returns a thread-specific storage dictionary. (It is a generalization of the get_transaction() function from ZODB, the object database underlying Zope.) The returned dictionary can be used to store data that is "private" to the thread.

try:
    import thread
except ImportError:
    # We're running on a single-threaded OS (or the Python interpreter has
    # not been compiled to support threads), so return a standard dictionary.
    _tss = {}
    def get_thread_storage():
        """Return the single shared storage dictionary."""
        return _tss
else:
    _tss = {}
    _tss_lock = thread.allocate_lock()
    def get_thread_storage():
        """Return a thread-specific storage dictionary."""
        thread_id = thread.get_ident() # Identify the calling thread.
        tss = _tss.get(thread_id)
        if tss is None: # First time being called by this thread.
            _tss_lock.acquire() # Entering critical section.
            try:
                _tss[thread_id] = tss = {} # Create a thread-specific dictionary.
            finally:
                _tss_lock.release()
        return tss

One benefit of multi-threaded programs is that all of the threads can share global objects. Sometimes, however, each thread needs its own storage, for example to hold a network or database connection unique to that thread. The get_thread_storage() function returns a dictionary object that is unique to each thread. For an exhaustive treatment of thread-specific storage (albeit aimed at C++ programmers) see http://www.cs.wustl.edu/~schmidt/PDF/TSS-pattern.pdf.

6 comments

John E. Barham (author) 22 years, 8 months ago  # | flag

Bug-fix to single-threaded version. Oops...

try:
    import thread
except:
    def get_thread_storage():
        return {}

is wrong since it returns a different dictionary instance every time it is called. It should be:

try:
    import thread
except ImportError:
    _tss = {}
    def get_thread_storage():
        return _tss

Sami Hangaslammi 22 years, 8 months ago  # | flag

Problem with get_ident(). The ids returned by get_ident() are unique only among the currently active threads, so when a thread dies its id (and storage) might be reused by another thread.

I don't know how often this could bite you in practice, but maybe there should be a clear_storage()-function that a thread can call when it's done.

Jens Engel 22 years, 8 months ago  # | flag

_tss dictionary is not multi-threading protected. The TSS implementation is based on a "global" variable that can be accessed by all threads within a process. Currently, potential problems occur when a TSS entry is created for a new thread or when a TSS entry is removed. Both actions can lead to a restructuring of the internal dictionary structure. Threads that access their TSS variable at this point may obtain access to "dangling" or corrupted TSS data.

As already stated in a comment above, a thread that creates a TSS entry should also remove it again when it dies or is killed.

John E. Barham (author) 22 years, 8 months ago  # | flag

Thread-safe get_thread_storage(). Yup, there does exist a potential race condition when changing the global _tss dictionary. Anyway, this should fix that:

...
else:
    _tss = {}
    _tss_lock = thread.allocate_lock()
    def get_thread_storage():
        """Return a thread-specific storage dictionary."""
        thread_id = thread.get_ident() # Identify the calling thread.
        tss = _tss.get(thread_id)
        if tss is None: # First time being called by this thread.
            _tss_lock.acquire() # Entering critical section.
            try:
                _tss[thread_id] = tss = {} # Create a thread-specific dictionary.
            finally:
                _tss_lock.release()
        return tss

As to the problem of deleting thread-specific storage on thread death, that is an issue since thread ids can be recycled. However, I use get_thread_storage() in a program that has a pool of "worker" threads that live as long as the main thread so it isn't a problem in this scenario.

Writing a corresponding delete_thread_storage() is thus left as an exercise for the reader ;), but it is symmetric to get_thread_storage().

John E. Barham (author) 22 years, 7 months ago  # | flag

Updated source reflects bug-fix comments. The updated source code now incorporates the changes I made in earlier comments.

Andres Tuells 22 years, 4 months ago  # | flag

better get_thread_storage.

def get_thread_storage(_get_ident=thread.get_ident): # Make thread.get_ident a local var.
    """Return a thread-specific storage dictionary."""
    thread_id = _get_ident() # Identify the calling thread.
    try:
        return _tss[thread_id]
    except KeyError:
        tss = _tss[thread_id] = {}
        return tss

I don't need a lock because only one thread exists with a given thread_id.

Created by John E. Barham on Wed, 25 Jul 2001 (PSF)