An O(1) LRU cache. Short, sweet, and fast. No bells and whistles.
```python
def cache_generator(original_function, maxsize):
    mapping = {}                        # key -> link
    mapping_get = mapping.get
    root = [None, None]                 # sentinel of a circular doubly-linked list
    root[:] = [root, root]
    value = None
    size = 0
    PREV, NEXT, KEY, VALUE = 0, 1, 2, 3
    while 1:
        key = yield value
        link = mapping_get(key, root)
        if link is root:
            # Cache miss: compute the value and add a new link at the
            # most-recently-used position (just before root).
            value = original_function(key)
            if size < maxsize:
                size += 1
            else:
                # Full: evict the least-recently-used link (just after root).
                old_prev, old_next, old_key, old_value = root[NEXT]
                root[NEXT] = old_next
                old_next[PREV] = root
                del mapping[old_key]
            last = root[PREV]
            link = [last, root, key, value]
            mapping[key] = last[NEXT] = root[PREV] = link
        else:
            # Cache hit: move the link to the most-recently-used position.
            link_prev, link_next, key, value = link
            link_prev[NEXT] = link_next
            link_next[PREV] = link_prev
            last = root[PREV]
            last[NEXT] = root[PREV] = link
            link[PREV] = last
            link[NEXT] = root

def make_cache(original_function, maxsize=100):
    'Create a cache around a function that takes a single argument'
    c = cache_generator(original_function, maxsize)
    next(c)
    return c.send

if __name__ == '__main__':
    p = make_cache(ord, maxsize=3)
    for c in 'abcdecaeaa':
        print(c, p(c))
```
Simplified and highly optimized from the version that was added to the standard library.
The main optimization is to simplify the functionality (single argument only, no keyword arguments, no tracking of the hit/miss ratio, and no clear() method).
The next major optimization was to inline the relevant code from Python's OrderedDict implementation.
Another optimization is to use only one dictionary (the OrderedDict version uses two). It points at links organized as [PREV, NEXT, KEY, VALUE].
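The link layout can be illustrated in isolation. The sketch below uses a hypothetical insert_before_root helper; the recipe inlines this logic rather than calling a function, and its root only ever needs the PREV/NEXT slots, so it gets by with a 2-item list.

```python
PREV, NEXT, KEY, VALUE = 0, 1, 2, 3

# A 4-slot sentinel keeps the sketch uniform; the recipe uses a 2-item root.
root = [None, None, None, None]
root[:] = [root, root, None, None]    # empty list: root points at itself

def insert_before_root(sentinel, key, value):
    # Hypothetical helper (the recipe inlines this): append a link at the
    # most-recently-used end, just before the sentinel.
    last = sentinel[PREV]
    link = [last, sentinel, key, value]
    last[NEXT] = sentinel[PREV] = link
    return link

a = insert_before_root(root, 'a', 97)
b = insert_before_root(root, 'b', 98)
assert root[NEXT] is a                # least recently used: right after root
assert root[PREV] is b                # most recently used: right before root
assert a[NEXT] is b and b[PREV] is a
```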
An interesting optimization is to roll all of the code into a generator and make calls using send(). This lets all of the state variables be local variables, and it saves the cost of creating a new stack frame on every call. Besides making the code very fast, it also keeps it quite simple (34 lines of dirt-simple Python instructions operating on lists and dictionaries).
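The send()-based pattern in miniature: a hypothetical running counter whose only state is a local variable inside a suspended generator frame, with the bound send() method serving as the callable.

```python
def counter_generator(start):
    # All state lives in the locals of a suspended generator frame;
    # each send() resumes that frame instead of building a new one.
    count = start
    value = None
    while True:
        step = yield value
        count += step
        value = count

def make_counter(start=0):
    g = counter_generator(start)
    next(g)          # prime the generator up to the first yield
    return g.send    # the bound send() method *is* the callable

add = make_counter(10)
assert add(5) == 15
assert add(2) == 17
```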
The instructions were selected to be among Python's fastest (for example, the eval loop has a fast path for a binary subscript with an integer index into a list, and a similar fast path for integer comparisons).
The code runs well on PyPy, Jython, and any version of CPython from 2.5 onward.
Neat! maxsize is off-by-one though: size starts at 0, so if maxsize == 1, then size > maxsize is still False when one entry has been cached, and two entries get cached. Example: ... outputs: ... so three entries are stored.
Nice recipe. You could have items deleted from the cache garbage-collected earlier if you replace line 21 by ... Otherwise, the references in old_key and old_value keep the evicted value alive until the next eviction.

Unfortunately, it's not reentrant, so it's not very useful for memoization: a cached function that calls back into its own cache raises ValueError: generator already executing.
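The reentrancy limitation is easy to reproduce with a stripped-down stand-in (a hypothetical gen_cache with no LRU bookkeeping; the failure mode is the same for the full recipe):

```python
def gen_cache(func):
    # Minimal generator-based cache (no eviction) -- just enough to show
    # what happens when the cached function re-enters the cache.
    mapping = {}
    value = None
    while True:
        key = yield value
        if key not in mapping:
            mapping[key] = func(key)
        value = mapping[key]

def fib(n):
    if n < 2:
        return n
    return p(n - 1) + p(n - 2)   # calls back into the running generator

g = gen_cache(fib)
next(g)
p = g.send

try:
    p(5)
except ValueError as e:
    error_message = str(e)       # "generator already executing"

print(error_message)
```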
Oh, this code has a very big problem! Once an exception occurs in the cached function, the generator terminates, and from then on every call raises a StopIteration exception.
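The termination problem can be reproduced with the same kind of stripped-down stand-in (hypothetical gen_cache and flaky, no LRU bookkeeping):

```python
def gen_cache(func):
    # Minimal generator-based cache (no eviction) -- enough to show that
    # an exception in func terminates the generator for good.
    mapping = {}
    value = None
    while True:
        key = yield value
        if key not in mapping:
            mapping[key] = func(key)   # an exception here kills the generator
        value = mapping[key]

def flaky(key):
    if key == 'x':
        raise KeyError(key)
    return len(key)

g = gen_cache(flaky)
next(g)
p = g.send

print(p('ab'))        # works fine on the first call
try:
    p('x')            # KeyError propagates out of send() ...
except KeyError:
    pass
try:
    p('ab')           # ... and now the generator is dead
except StopIteration:
    print('generator terminated')
```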
Possible solution: use a try...except block and yield value, the_exception_to_be_reraised or None, so a wrapper can re-raise the exception outside the generator.
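A sketch of that suggestion (hypothetical names, and LRU bookkeeping omitted to keep it short): catch exceptions inside the generator so it keeps running, yield a (value, exception) pair, and have a small wrapper re-raise on the outside.

```python
def safe_cache_generator(original_function):
    # Exceptions are caught inside the loop, so the generator survives them.
    mapping = {}
    result = None                       # (value, exception) pair per send()
    while True:
        key = yield result
        try:
            if key in mapping:
                result = (mapping[key], None)
            else:
                mapping[key] = value = original_function(key)
                result = (value, None)
        except Exception as exc:
            result = (None, exc)        # hand the exception to the wrapper

def make_safe_cache(original_function):
    gen = safe_cache_generator(original_function)
    next(gen)
    send = gen.send
    def call(key):
        value, exc = send(key)
        if exc is not None:
            raise exc                   # re-raised outside the generator
        return value
    return call
```

Only successful results are cached here; a failing key is retried on the next call, and the generator stays alive across exceptions.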