ActiveState Code

Recipe 66429: Thread-specific storage


The get_thread_storage() function described below returns a thread-specific storage dictionary. (It is a generalization of the get_transaction() function from ZODB, the object database underlying Zope.) The returned dictionary can be used to store data that is "private" to the thread.

Python
try:
    import thread
except ImportError:
    # We're running on a single-threaded OS (or the Python interpreter has
    # not been compiled to support threads), so every caller shares one
    # standard dictionary.
    _tss = {}
    def get_thread_storage():
        """Return the single shared storage dictionary."""
        return _tss
else:
    _tss = {} # Maps thread id -> that thread's private dictionary.
    _tss_lock = thread.allocate_lock()
    def get_thread_storage():
        """Return a thread-specific storage dictionary."""
        thread_id = thread.get_ident() # Identify the calling thread.
        tss = _tss.get(thread_id)
        if tss is None: # First time being called by this thread.
            _tss_lock.acquire() # Entering critical section.
            try:
                _tss[thread_id] = tss = {} # Create a thread-specific dictionary.
            finally:
                _tss_lock.release()
        return tss
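
A minimal usage sketch, assuming Python 3, where the thread module was renamed _thread and threading.get_ident() returns the same kind of identifier. The run_demo() helper and the "worker-N" names are hypothetical and exist only to show that concurrently running threads each receive their own dictionary.

```python
import threading

_tss = {}                     # thread id -> that thread's private dictionary
_tss_lock = threading.Lock()


def get_thread_storage():
    """Return a dictionary private to the calling thread."""
    thread_id = threading.get_ident()   # Python 3 spelling of thread.get_ident()
    tss = _tss.get(thread_id)
    if tss is None:                     # first call from this thread
        with _tss_lock:
            tss = _tss.setdefault(thread_id, {})
    return tss


def run_demo(count=4):
    """Start `count` workers; return {name: (id of storage dict, stored name)}."""
    barrier = threading.Barrier(count)
    seen = {}

    def worker(name):
        storage = get_thread_storage()
        storage["name"] = name
        barrier.wait()   # keep every worker alive so no thread id is recycled yet
        seen[name] = (id(storage), get_thread_storage()["name"])

    threads = [threading.Thread(target=worker, args=("worker-%d" % i,))
               for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Each worker reads back the name it stored, and no two workers see the same dictionary object while they are alive at the same time.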

Discussion

One benefit of multi-threaded programs is that all of the threads can share global objects. Sometimes, however, each thread needs storage of its own, for example to hold a network or database connection that no other thread should use. The get_thread_storage() function returns a dictionary object that is unique to the calling thread. For an exhaustive treatment of thread-specific storage (albeit aimed at C++ programmers) see http://www.cs.wustl.edu/~schmidt/PDF/TSS-pattern.pdf.
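
For comparison, the per-thread connection case can also be sketched with threading.local from the standard library (available since Python 2.4). This is a different technique from the recipe's dictionary keyed by thread id, not the author's method. The names get_connection() and collect_connections() are hypothetical, and a bare object() stands in for a real network or database connection.

```python
import threading

_local = threading.local()    # attributes set on it are visible only to the setting thread


def get_connection():
    """Return the calling thread's connection, creating it on first use."""
    conn = getattr(_local, "connection", None)
    if conn is None:
        conn = object()       # placeholder for a real connection object
        _local.connection = conn
    return conn


def collect_connections(count=3):
    """Run `count` threads and return the connection each one obtained."""
    conns = []

    def worker():
        first = get_connection()
        second = get_connection()      # same thread, so the same object again
        if first is second:
            conns.append(first)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return conns
```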

Comments

  1. At 4:42 p.m. on 25 jul 2001, John E. Barham (the author) said:

    Bug-fix to single-threaded version. Oops...

    try:
        import thread
    except:
        def get_thread_storage():
            return {}
    

    is wrong since it returns a different dictionary instance every time it is called. It should be:

    try:
        import thread
    except:
        _tss = {}
        def get_thread_storage():
            return _tss
    
  2. At 3:47 a.m. on 26 jul 2001, Sami Hangaslammi said:

    Problem with get_ident(). The ids returned by get_ident() are unique only among the currently active threads, so when a thread dies its id (and storage) might be reused by another thread.

    I don't know how often this could bite you in practice, but maybe there should be a clear_storage() function that a thread can call when it's done.

  3. At 4:06 a.m. on 30 jul 2001, Jens Engel said:

    _tss dictionary is not multi-threading protected. The TSS implementation is based on a "global" variable that can be accessed by all threads within a process. Currently, potential problems occur when a TSS entry is created for a new thread or when a TSS entry is removed. Both actions can lead to a restructuring of the internal dictionary structure. Threads that access their TSS variable at that moment may obtain access to "dangling" or corrupted TSS data.

    As already stated in a comment above, a thread that creates a TSS entry should also remove it again when it dies or is killed.

  4. At 3:14 p.m. on 30 jul 2001, John E. Barham (the author) said:

    Thread-safe get_thread_storage(). Yup, there does exist a potential race condition when changing the global _tss dictionary. Anyway, this should fix that:

    ...
    else:
        _tss = {}
        _tss_lock = thread.allocate_lock()
        def get_thread_storage():
            """Return a thread-specific storage dictionary."""
            thread_id = thread.get_ident() # Identify the calling thread.
            tss = _tss.get(thread_id)
            if tss is None: # First time being called by this thread.
                try: # Entering critical section.
                    _tss_lock.acquire()
                    _tss[thread_id] = tss = {} # Create a thread-specific dictionary.
                finally:
                    _tss_lock.release()
            return tss
    

    As to the problem of deleting thread-specific storage on thread death, that is an issue since thread ids can be recycled. However, I use get_thread_storage() in a program that has a pool of "worker" threads that live as long as the main thread so it isn't a problem in this scenario.

    Writing a corresponding delete_thread_storage() is thus left as an Exercise for the Reader, ;), but it is symmetric to get_thread_storage().

  5. At 11:12 a.m. on 8 aug 2001, John E. Barham (the author) said:

    Updated source reflects bug-fix comments. The updated source code now incorporates the changes I made in earlier comments.

  6. At 8:39 a.m. on 6 nov 2001, Andres Tuells said:

    better get_thread_storage.

    def get_thread_storage(_get_ident=thread.get_ident): # make thread.get_ident a local var
        """Return a thread-specific storage dictionary."""
        thread_id = _get_ident() # Identify the calling thread.
        try:
            return _tss[thread_id]
        except KeyError:
            tss = _tss[thread_id] = {}
            return tss
    

    I don't need a lock because only one thread exists with a given thread_id.
