
Return a timestamp string for @lru_cache decorated functions.

The returned timestamp is used as the value of an extra parameter to @lru_cache decorated functions, allowing for more control over how often cache entries are refreshed. The lru_timestamp function should be called with the same refresh_interval value for a given @lru_cache decorated function. The returned timestamp is for the benefit of the @lru_cache decorator and is normally not used by the decorated function.

Positional arguments: refresh_interval -- in minutes (default 60); values less than 1 are coerced to 1, and values greater than 1440 are coerced to 1440. A non-int value raises TypeError.
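To make the timestamp format concrete, here is a minimal sketch that applies the same bucketing arithmetic to a fixed point in time. The helper name timestamp_for and the sample dates are assumptions made for illustration; the recipe itself always reads the current time.

```python
import datetime


def timestamp_for(dt, refresh_interval=60):
    """Hypothetical variant of lru_timestamp that takes the time explicitly."""
    if refresh_interval > 60:
        # Intervals above one hour are counted from the start of the day.
        refresh_interval = min(refresh_interval, 60 * 24)
        fmt = '%Y%m%d'
        minutes = dt.hour * 60
    else:
        # Intervals up to one hour are counted from the start of the hour.
        refresh_interval = max(1, refresh_interval)
        fmt = '%Y%m%d%H'
        minutes = dt.minute
    return '{0}:{1:d}'.format(dt.strftime(fmt), minutes // refresh_interval)


morning = datetime.datetime(2024, 3, 5, 9, 14)
later = datetime.datetime(2024, 3, 5, 9, 44)

print(timestamp_for(morning, 30))   # 2024030509:0
print(timestamp_for(later, 30))     # 2024030509:1
print(timestamp_for(morning, 120))  # 20240305:4
print(timestamp_for(morning, -5))   # 2024030509:14 (interval coerced to 1)
```

Two calls that fall into the same bucket produce identical strings, so the cache key stays the same and the cached value is reused.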

Python, 104 lines
#!/usr/bin/python3

""" Test script for lru_timestamp function.

usage: lru.py [-h] [-r REFRESH] [-s SLEEP]

optional arguments:
  -h, --help            show this help message and exit
  -r REFRESH, --refresh REFRESH
                        refresh interval (default 60 min)
  -s SLEEP, --sleep SLEEP
                        sleep interval (default 10 min)

"""

import argparse
import datetime
import functools
import random
import time


def lru_timestamp(refresh_interval=60):
    """ Return a timestamp string for @lru_cache decorated functions.

    The returned timestamp is used as the value of an extra parameter
    to @lru_cache decorated functions, allowing for more control over
    how often cache entries are refreshed. The lru_timestamp function
    should be called with the same refresh_interval value for a given
    @lru_cache decorated function.  The returned timestamp is for the
    benefit of the @lru_cache decorator and is normally not used by
    the decorated function.

    Positional arguments:
    refresh_interval -- in minutes (default 60), values less than 1
                        are coerced to 1, values more than 1440 are
                        coerced to 1440

    """

    if not isinstance(refresh_interval, int):
        raise TypeError('refresh_interval must be an int from 1-1440')

    dt = datetime.datetime.now()

    if refresh_interval > 60:
        refresh_interval = min(refresh_interval, 60*24)
        fmt = '%Y%m%d'
        minutes = dt.hour * 60
    else:
        refresh_interval = max(1, refresh_interval)
        fmt = '%Y%m%d%H'
        minutes = dt.minute

    ts = dt.strftime(fmt)
    age = minutes // refresh_interval
    return '{0}:{1:d}'.format(ts, age)


@functools.lru_cache()
def calculate(x, y, timestamp):
    """ Return random int for testing lru_timestamp function."""

    # timestamp is not used in the body; it only forms part of the cache key
    print('performing calculation (not from cache), timestamp:', timestamp)
    return random.randint(x, y)


def init():
    """ Return parsed command line args."""

    random.seed()
    parser = argparse.ArgumentParser(fromfile_prefix_chars='@')

    parser.add_argument('-r', '--refresh', type=int, dest='refresh',
                        default=60, help='refresh interval (default 60 min)')

    parser.add_argument('-s', '--sleep', type=int, dest='sleep', default=10,
                        help='sleep interval (default 10 min)')

    return parser.parse_args()


def main():
    """ Script main."""

    args = init()
    print('refresh interval (min):', args.refresh)
    print('sleep interval (min):', args.sleep)
    print()
    refresh = args.refresh
    doze = args.sleep * 60

    # Uncomment one line at a time to exercise invalid or out-of-range
    # values: the first two raise TypeError, the last two are coerced.
    #num = calculate(1, 1000, lru_timestamp('junk'))
    #num = calculate(1, 1000, lru_timestamp(1.22))
    #num = calculate(1, 1000, lru_timestamp(-1))
    #num = calculate(1, 1000, lru_timestamp(2000))

    while True:
        num = calculate(1, 1000, lru_timestamp(refresh))
        print('calculation returned', num)
        time.sleep(doze)

if __name__ == '__main__':
    main()

Rationale:

Some functions have input parameters that rarely change, yet return different results over time. It would be useful to have a ready-made way to force lru_cache entries to be refreshed at specified time intervals.

A common example is using a stable userid to read user information from a database. By itself, the lru_cache decorator can cache the user information and prevent unnecessary I/O. However, if a given user's information is updated in the database while the associated lru_cache entry has not yet been discarded, the application will use stale data. The lru_timestamp function is a simple, ready-made helper that gives the developer more control over the age of lru_cache entries in such situations.

Sample usage:

@functools.lru_cache()
def user_info(userid, timestamp):
    # expensive database i/o, but value changes over time
    # the timestamp parameter is normally not used, it is
    # for the benefit of the @lru_cache decorator
    pass

# read user info from database, if not in cache or
# older than 120 minutes
info = user_info('johndoe', lru_timestamp(120))
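The effect of the extra parameter can be seen in a small self-contained sketch. The FAKE_DB dictionary, the reads list, and the hard-coded timestamp strings are all assumptions standing in for a real database and for calls to lru_timestamp; they are used only so the behaviour is repeatable.

```python
import functools

# Hypothetical in-memory stand-in for the database; not part of the recipe.
FAKE_DB = {'johndoe': 'John Doe <john@example.com>'}
reads = []  # records every simulated database read


@functools.lru_cache()
def user_info(userid, timestamp):
    # timestamp is unused here; it only widens the cache key.
    reads.append(userid)
    return FAKE_DB[userid]


# Same timestamp: the second call is served from the cache.
user_info('johndoe', '20240305:4')
user_info('johndoe', '20240305:4')
print(len(reads))  # 1

# The record changes, but the cached copy is still returned...
FAKE_DB['johndoe'] = 'John Doe <jd@example.org>'
print(user_info('johndoe', '20240305:4'))  # John Doe <john@example.com>

# ...until the timestamp moves to the next bucket.
print(user_info('johndoe', '20240305:5'))  # John Doe <jd@example.org>
print(len(reads))  # 2
```

Entries keyed by an old timestamp are never removed explicitly; they stay in the cache until the LRU policy evicts them, so the maxsize of the decorator still bounds memory use.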

1 comment

Zhuiguang Liu 7 years, 3 months ago

To clarify, this function returns the current hour or day followed by the index of the interval-sized bucket that the current time falls into. This means that a cache entry can actually expire before the given interval has fully passed; there is no guarantee that the entry will survive for that interval, even when no new entries evict it.

For instance, if the current hour is 16 (4 PM) and we want to preserve a cache entry for 3 hours (180 minutes), the returned timestamp has the form YYYYMMDD:5 (16 * 60 // 180 = 5). However, if a request for the same cache entry arrives at 6 PM (hour 18), the function returns YYYYMMDD:6 (18 * 60 // 180 = 6), which effectively invalidates the cached entry even though the interval has not yet expired.
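The arithmetic in this comment can be checked with a few lines. The helper name day_bucket is an assumption used only for illustration; it reproduces the hour-based integer division that the recipe applies to intervals above 60 minutes.

```python
def day_bucket(hour, refresh_interval):
    """Bucket index for intervals above 60 minutes (hypothetical helper)."""
    return hour * 60 // refresh_interval


refresh_interval = 180  # minutes

buckets = {hour: day_bucket(hour, refresh_interval) for hour in range(15, 19)}

for hour, bucket in buckets.items():
    print('{0:02d}:00 -> YYYYMMDD:{1}'.format(hour, bucket))
# 15:00 -> YYYYMMDD:5
# 16:00 -> YYYYMMDD:5
# 17:00 -> YYYYMMDD:5
# 18:00 -> YYYYMMDD:6
```

Bucket 5 spans 15:00 to 17:59, so an entry cached at 16:00 lives for two hours and one cached at 17:59 lives for about a minute. The interval is therefore an upper bound on the age of an entry, not a guaranteed lifetime.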