Latest recipes by Steven D'Aprano

Guard against an exception in the wrong place (Python)

2017-06-25T17:17:43-07:00

Python recipe 580808 by Steven D'Aprano (context, exception, guard, manager). Revision 2.

Sometimes exception handling can obscure bugs unless you guard against a particular exception occurring in a certain place. One example is that accidentally raising StopIteration inside a generator will halt the generator instead of displaying a traceback. That was solved by PEP 479, which automatically converts such StopIteration exceptions into RuntimeError. See the discussion below for further examples.

Here is a class which can be used as either a decorator or context manager for guarding against the given exceptions. It takes an exception (or a tuple of exceptions) as argument, and if the wrapped code raises that exception, it is re-raised as another exception type (by default RuntimeError).

For example:

try:
    with exception_guard(ZeroDivisionError):
        1/0  # raises ZeroDivisionError
except RuntimeError:
    print('ZeroDivisionError replaced by RuntimeError')


@exception_guard(KeyError)
def demo():
    return {}['key']  # raises KeyError

try:
    demo()
except RuntimeError:
    print('KeyError replaced by RuntimeError')
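
A minimal sketch of how such a guard might be written follows. It is not the recipe's code: the parameter names catch and throw, and the wording of the replacement message, are assumptions made for illustration.

```python
import functools


class exception_guard:
    """Re-raise the guarded exception(s) as another exception type.

    Hypothetical sketch: `catch` and `throw` are assumed parameter names.
    """

    def __init__(self, catch=Exception, throw=RuntimeError):
        self.catch = catch   # an exception class or a tuple of classes
        self.throw = throw   # the replacement exception type

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # issubclass accepts a tuple, so one test covers both forms of `catch`.
        if exc_type is not None and issubclass(exc_type, self.catch):
            raise self.throw("guarded exception was raised") from exc_value
        return False  # let any other exception propagate unchanged

    def __call__(self, func):
        # Used as a decorator: run the function body inside the guard.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)
        return wrapper
```

Chaining with "from" keeps the original exception available as __cause__, so the traceback still shows where the guarded exception came from.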

Collection Pipeline in Python (Python)

2016-03-16T14:45:02-07:00

Python recipe 580625 by Steven D'Aprano (bash, filter, map, pipe, pipline).

A powerful functional programming technique is the use of pipelines of functions. If you have used shell scripting languages like bash, you will have used this technique. For instance, to count the number of files or directories, you might say: ls | wc -l. The output of the first command is piped into the input of the second, and the result returned.

We can set up a similar pipeline using Lisp-like Map, Filter and Reduce special functions. Unlike the standard Python map, filter and reduce, these are designed to operate in a pipeline, using the same | syntax used by bash and other shell scripting languages:

>>> data = range(1000)
>>> data | Filter(lambda n: 20 < n < 30) | Map(float) | List
[21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0]

The standard Python functional tools are much less attractive, as you have to write the functions in the opposite order to how they are applied to the data. This makes it harder to follow the logic of the expression.

>>> list(map(float, filter(lambda n: 20 < n < 30, data)))
[21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0]

We can also end the pipeline with a call to Reduce to collate the sequence into a single value. Here we take a string, extract all the digits, convert to ints, and multiply:

>>> from operator import mul
>>> "abcd12345xyz" | Filter(str.isdigit) | Map(int) | Reduce(mul)
120
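
One way such pipeline stages could be built is with the reflected-or method __ror__, which Python calls when the left operand does not handle | itself. The sketch below reuses the names Map, Filter, Reduce and List from the examples above, but the implementation is an assumption rather than the recipe's code.

```python
from functools import reduce
from operator import mul


class Map:
    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        # Called for `iterable | Map(func)`.
        return map(self.func, iterable)


class Filter:
    def __init__(self, predicate):
        self.predicate = predicate

    def __ror__(self, iterable):
        return filter(self.predicate, iterable)


class Reduce:
    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        return reduce(self.func, iterable)


class _ListCollector:
    def __ror__(self, iterable):
        return list(iterable)


# A ready-made instance, so a pipeline can end with `| List`.
List = _ListCollector()

digits_product = "abcd12345xyz" | Filter(str.isdigit) | Map(int) | Reduce(mul)
```

Because Map and Filter return lazy iterators, nothing is computed until a terminal stage such as List or Reduce consumes the pipeline.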

Safely and atomically write to a file (Python)

2016-03-23T14:14:26-07:00

Python recipe 579097 by Steven D'Aprano (atomic, file, save). Revision 3.

Saving the user's data is risky. If you write to a file directly and an error occurs during the write, you may corrupt the file and lose the user's data. One way to prevent this is to write to a temporary file and overwrite the original only once you know the temporary file has been written successfully. This function returns a context manager which makes this more convenient.

Update: this now uses os.replace when available, and hopefully will work better on Windows.
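
The write-then-replace idea can be sketched as below. The function name atomic_write and its signature are assumptions for illustration, and the sketch relies on os.replace unconditionally, so it assumes Python 3.3 or later.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_write(path, mode="w"):
    """Yield a temporary file; replace `path` only if the block succeeds.

    Hypothetical helper, not the recipe's actual function.
    """
    # Create the temporary file in the target's directory so that the
    # final rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, mode) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # The write failed: discard the temporary file, keep the original.
        with contextlib.suppress(FileNotFoundError):
            os.remove(tmp_path)
        raise
```

If the body of the with block raises, the original file is left untouched and the temporary file is removed.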

Call out to an external editor (Python)

2014-09-01T18:26:51-07:00

Python recipe 578926 by Steven D'Aprano (editing, editor, external, text). Revision 2.

Here's a function that lets you use Python to wrap calls to an external editor. The editor can be a command-line editor, like the venerable old "ed", or something more powerful like nano, vim or emacs, or even a GUI editor. After the editor quits, the text you typed in the editor is returned by the function.

A simple example, using the (rather cryptic) 'ed' editor on Linux. For the benefit of those unfamiliar with 'ed', I have annotated the editor session with comments.

>>> status, text = edit('ed')
0                        ## ed prints the initial number of lines
a                        ## start "append" mode
Hello World!
Goodbye now
.                        ## stop appending
w                        ## write the file to disk
25                       ## ed prints the number of bytes written
q                        ## quit ed and return to Python
>>> status
0
>>> print text
Hello World!
Goodbye now

Yet Another NamedTuple (Python)

2014-08-02T02:56:12-07:00

Python recipe 578918 by Steven D'Aprano (shortcuts).

Yet another look at Raymond Hettinger's excellent "namedtuple" factory. Unlike Raymond's version, this one minimizes the use of exec. In the original, the entire inner class is dynamically generated then exec'ed, leading to the bulk of the code being inside a giant string template. This version uses a regular inner class for everything except the __new__ method.

Method chaining or cascading (Python)

2016-09-01T12:34:17-07:00

Python recipe 578770 by Steven D'Aprano (cascade, cascading, chaining, method).

A frequently missed feature of built-ins like lists and dicts is the ability to chain method calls like this:

x = []
x.append(1).append(2).append(3).reverse().append(4)
# x now equals [3, 2, 1, 4]

Unfortunately this doesn't work, as mutator methods return None rather than self. One possibility is to design your class from the beginning with method chaining in mind, but what do you do with classes, like the built-ins, that weren't designed that way?

This is sometimes called method cascading. Here's a proof-of-concept for an adapter class which turns any object into one with methods that can be chained.
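
As a rough sketch of the adapter idea (the class name chained and the choice to expose the wrapped object as .obj are assumptions, not the recipe's API):

```python
class chained:
    """Wrap an object so that methods returning None return the wrapper."""

    def __init__(self, obj):
        self.obj = obj

    def __getattr__(self, name):
        # Only reached for names not found on the wrapper itself.
        attr = getattr(self.obj, name)
        if not callable(attr):
            return attr

        def method(*args, **kwargs):
            result = attr(*args, **kwargs)
            # Mutators return None; hand back the wrapper so calls can chain.
            return self if result is None else result

        return method


x = []
chained(x).append(1).append(2).append(3).reverse().append(4)
# x is now [3, 2, 1, 4]
```

One limitation of this approach is that a method which legitimately returns None as data is indistinguishable from a mutator.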

Using ChainMap for embedded namespaces (Python)

2012-10-03T18:12:50-07:00

Python recipe 578279 by Steven D'Aprano (chainmap, metaclass, namespace).

The Zen of Python tells us:

Namespaces are one honking great idea -- let's do more of those!

Python already has an excellent namespace type, the module, but the problem with modules is that they have to live in a separate file, and sometimes you want the convenience of a single file while still encapsulating your code into namespaces. That's where classes are the usual solution, but classes need to be instantiated and methods need to be defined with a self parameter.

C++ has "namespaces" for encapsulating related objects and dividing the global scope into sub-scopes. Can we do the same in Python?

With a bit of metaclass trickery and the new ChainMap type from Python 3.3, we can!

Get single keypress (Python)

2011-12-06T08:46:33-08:00

Python recipe 577977 by Steven D'Aprano (getch, key).

Here's a platform-independent module that exposes a single function, getch, which reads stdin for a single character. It uses msvcrt.getch on Windows, and should work on any platform that supports the tty and termios modules (e.g. Linux).

This has been tested on Python 2.4, 2.5 and 2.6 on Linux.

Benchmark code with the with statement (Python)

2011-10-08T09:53:06-07:00

Python recipe 577896 by Steven D'Aprano (benchmark, speed, time, timer).

Inspired by this post, I wrote this context manager to benchmark code blocks or function calls.

Usage is incredibly simple:

with Timer():
    ...  # code to benchmark goes here

The time taken (in seconds) will be printed when the code block completes. Capturing the time taken programmatically is almost as easy:

t = Timer()
with t:
    ...  # code to benchmark goes here
time_taken = t.interval

Due to the difficulties of timing small snippets of code accurately, you should only use this for timing code blocks or function calls which take a significant amount of time to process. For micro-benchmarks, you should use the timeit module.
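
A context manager along these lines might look like the sketch below. The interval attribute comes from the usage shown above; the verbose flag and the use of time.perf_counter (Python 3.3+) are assumptions.

```python
import time


class Timer:
    """Time a block of code; a sketch, not the recipe's implementation."""

    def __init__(self, verbose=True):
        self.verbose = verbose   # assumed flag: print the result on exit
        self.interval = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.interval = time.perf_counter() - self._start
        if self.verbose:
            print("time taken: %f seconds" % self.interval)
        return False  # never swallow exceptions from the timed block


t = Timer(verbose=False)
with t:
    total = sum(range(100000))
time_taken = t.interval
```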

Equally-spaced floats part 2 (Python)

2011-09-27T15:48:00-07:00

Python recipe 577881 by Steven D'Aprano (float, range, spread).

See the recipe here for a description of the problem.

This variant allows the user to choose between two APIs, either by supplying the count, start value and end value:

>>> list(spread(5, 1.0, end=2.0))
[1.0, 1.2, 1.4, 1.6, 1.8]

or the count, start value and step size:

>>> list(spread(5, 1.0, step=-0.25))
[1.0, 0.75, 0.5, 0.25, 0.0]

As before, the count argument specifies how many subdivisions are made. The optional argument mode selects whether the start and end values are included. By default, start is included and end is not, and exactly count values are returned.
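
The two calling conventions can be sketched as follows, assuming Python 3 division. This sketch implements only the default behaviour (start included, end excluded) and omits the mode argument; the error handling is an assumption.

```python
def spread(count, start, end=None, step=None):
    """Yield `count` equally spaced floats beginning at `start`.

    Exactly one of `end` or `step` must be given.
    """
    if (end is None) == (step is None):
        raise TypeError("specify exactly one of end or step")
    if step is None:
        step = (end - start) / count
    for i in range(count):
        # Multiply rather than repeatedly add, so that rounding errors
        # do not accumulate from one value to the next.
        yield start + i * step
```

Computing each value from the start avoids the drift you get from adding the step in a loop, although individual values are still subject to ordinary float rounding.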

Generate equally-spaced floats (Python)

2011-09-24T18:49:39-07:00

Python recipe 577878 by Steven D'Aprano (float, range, spread).

Generating a list of equally-spaced floats can be surprising due to floating point rounding. See, for example, the recipe for a floating point range. One way of avoiding some surprises is to change the API: instead of specifying start, stop and step values, specify a start, stop and count:

>>> list(spread(0.0, 2.1, 7))
[0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8]

Like frange, spread takes an optional mode argument to select whether the start and end values are included. By default, start is included and end is not, and exactly count values are returned.

Search sequences for sub-sequence (Python)

2011-08-19T05:17:00-07:00

Python recipe 577850 by Steven D'Aprano (find, searching, sequence, string, sublist, substring).

The list and tuple index() method and in operator test for element containment, unlike the similar tests for strings, which check for sub-strings:

>>> "12" in "0123"
True
>>> [1, 2] in [0, 1, 2, 3]
False

These two functions, search and rsearch, act like str.find() except they operate on any arbitrary sequence such as lists:

>>> search([1, "a", "b", 2, 3], ["b", 2])
2
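
A straightforward (quadratic-time) sketch of the two functions is shown below. The optional start parameter and the choice to return -1 on failure, mirroring str.find(), are assumptions about the recipe's interface.

```python
def search(sequence, sub, start=0):
    """Return the lowest index at which `sub` occurs in `sequence`, or -1."""
    target = list(sub)
    n = len(target)
    for i in range(start, len(sequence) - n + 1):
        if list(sequence[i:i + n]) == target:
            return i
    return -1


def rsearch(sequence, sub):
    """Return the highest index at which `sub` occurs in `sequence`, or -1."""
    target = list(sub)
    n = len(target)
    for i in range(len(sequence) - n, -1, -1):
        if list(sequence[i:i + n]) == target:
            return i
    return -1
```

Converting each slice to a list lets a list be searched for a tuple (or a string for a list of characters) without the type mismatch making the comparison fail.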

Integer square root function (Python)

2011-08-04T05:29:16-07:00

Python recipe 577821 by Steven D'Aprano (integer, isqrt, math, mathematics, maths, root, square).

The integer square root function, or isqrt, is equivalent to floor(sqrt(x)) for non-negative x. For small x, the most convenient way to calculate isqrt is by calling int(x**0.5) or int(math.sqrt(x)), but if x is a large enough integer, the sqrt calculation overflows.

You can calculate the isqrt without any floating point maths, using just pure integer maths, allowing the function to operate with numbers far larger than possible with floats:

>>> n = 1234567*(10**1000)
>>> n2 = n*n
>>> math.sqrt(n2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to float
>>> isqrt(n2) == n
True
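
One common pure-integer approach is Newton's method, sketched below. This is not necessarily the algorithm the recipe uses. (Since Python 3.8 the standard library also provides math.isqrt.)

```python
def isqrt(n):
    """Return the integer part of the square root of a non-negative int."""
    if n < 0:
        raise ValueError("isqrt() argument must be non-negative")
    if n == 0:
        return 0
    # Start from a power of two that is at least sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        # Newton step using only integer arithmetic.
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y
```

Starting from a guess at or above the true root means the iterates decrease monotonically, so the loop can stop as soon as a step fails to get smaller.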

Enhancing dir() with globs (Python)

2011-07-01T04:03:46-07:00

Python recipe 577774 by Steven D'Aprano (dir, filter, glob, globbing).

dir() is useful for interactively exploring the attributes and methods of objects at the command line. But sometimes dir() returns a lot of information:

>>> len(dir(decimal.Decimal))  # too much information!
137

It can be helpful to filter the list of names returned. This enhanced version of dir does exactly that, using simple string globbing:

>>> dir(decimal.Decimal, '*log*')  # just attributes with "log" in the name
['_fill_logical', '_islogical', '_log10_exp_bound', 'log10', 'logb', 
'logical_and', 'logical_invert', 'logical_or', 'logical_xor']
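
The filtering step can be sketched with the standard fnmatch module. To avoid shadowing the builtin, this sketch uses the hypothetical name globdir, and it does not reproduce the no-argument form of dir().

```python
import builtins
import fnmatch


def globdir(obj, pattern=None):
    """Return dir(obj), optionally filtered by a shell-style glob.

    Hypothetical name; the recipe itself replaces dir().
    """
    names = builtins.dir(obj)
    if pattern is None:
        return names
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


string_tests = globdir(str, "is*")  # str methods whose names start with "is"
```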

Calculate the MRO of a class (Python)

2011-06-11T08:31:09-07:00

Python recipe 577748 by Steven D'Aprano (c3, classes, method, mro, order, resolution).

This function allows you to calculate the Method Resolution Order (MRO, or sometimes linearization) of a class or base classes. This is the so-called "C3" algorithm, as used by Python (new-style classes, from version 2.3 and higher). The MRO is the order of base classes that Python uses to search for methods and attributes. For single inheritance, the MRO is obvious and straight-forward and not very exciting, but for multiple inheritance it's not always obvious what the MRO should be.
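
A compact sketch of the C3 merge is given below. The function names c3_merge and mro are assumptions, and the sketch takes a class as input rather than a list of bases.

```python
def c3_merge(sequences):
    """Merge linearizations according to the C3 rule."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A good head does not appear in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy, no valid MRO")
        result.append(head)
        # Remove the chosen class everywhere and drop emptied sequences.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def mro(cls):
    """Return the C3 linearization of `cls` as a list of classes."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([mro(base) for base in bases] + [bases])


class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass
# mro(D) gives [D, B, C, A, object], matching D.__mro__
```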

Validate ABNs (Australian Business Numbers) (Python)

2011-05-13T03:22:12-07:00

Python recipe 577692 by Steven D'Aprano (abn, validation).

This is a validation function for Australian Business Numbers (ABNs). It tests whether an integer or string is a valid ABN (but not whether it is a legal ABN, i.e. if it belongs to the person or business entity that is quoting it).
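
The published ABN check works as follows: subtract 1 from the first digit, multiply the eleven digits by the weights 10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, and test that the sum is divisible by 89. A sketch under those rules is below; the function name is_valid_abn and the decision to ignore spaces are assumptions.

```python
import re

ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(abn):
    """Return True if `abn` (int or str) passes the ABN checksum."""
    text = str(abn).replace(" ", "")
    if not re.fullmatch(r"[0-9]{11}", text):
        return False
    digits = [int(c) for c in text]
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    total = sum(w * d for w, d in zip(ABN_WEIGHTS, digits))
    return total % 89 == 0
```

For example, 51 824 753 556 gives a weighted sum of 534, which is 6 × 89, so it passes.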

Validate ACNs (Australian Company Numbers) (Python)

2011-05-13T03:15:35-07:00

Python recipe 577691 by Steven D'Aprano (acn, validation).

This is a validation function for Australian Company Numbers (ACNs). It tests whether the integer or string is a valid ACN (but not whether it is a legal ACN, i.e. if it belongs to the company that is quoting it).
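
The published ACN check multiplies the first eight digits by the weights 8 down to 1, takes the sum modulo 10, and compares the complement (10 minus the remainder, with 10 treated as 0) against the ninth digit. A sketch under those rules is below; the function name is_valid_acn, the handling of spaces, and the zero-padding of integers are assumptions.

```python
import re

ACN_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 1)


def is_valid_acn(acn):
    """Return True if `acn` (int or str) passes the ACN check digit test."""
    text = str(acn).replace(" ", "")
    if isinstance(acn, int):
        # An ACN may begin with zeros, which an int cannot represent.
        text = text.zfill(9)
    if not re.fullmatch(r"[0-9]{9}", text):
        return False
    digits = [int(c) for c in text]
    total = sum(w * d for w, d in zip(ACN_WEIGHTS, digits))
    check = (10 - total % 10) % 10
    return check == digits[8]
```

For example, 004 085 616 gives a weighted sum of 84, so the expected check digit is 6.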

map_longest and map_strict (Python)

2011-05-06T17:35:17-07:00

Python recipe 577687 by Steven D'Aprano (iteration, map). Revision 2.

In Python 3, the map builtin silently drops any excess items in its input:

>>> a = [1, 2, 3]
>>> b = [2, 3, 4]
>>> c = [3, 4, 5, 6, 7]
>>> list(map(lambda x,y,z: x*y+z, a, b, c))
[5, 10, 17]

In Python 2, map pads the shorter items with None, while itertools.imap drops the excess items. Inspired by this, and by itertools.zip_longest, I have written map_longest, which takes a value to fill missing items with, and map_strict, which raises an exception if a value is missing.

>>> list(map_longest(lambda x,y,z: x*y+z, a, b, c, fillvalue=0))
[5, 10, 17, 6, 7]
>>> list(map_strict(lambda x,y,z: x*y+z, a, b, c))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in map_strict
ValueError: too few items in iterable
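
Both functions can be sketched on top of itertools.zip_longest, as below. This is an illustrative Python 3 version, not the recipe's code; the sentinel name _MISSING is an assumption.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot collide with real data


def map_longest(func, *iterables, fillvalue=None):
    """Like map(), but pad shorter iterables with `fillvalue`."""
    for args in zip_longest(*iterables, fillvalue=fillvalue):
        yield func(*args)


def map_strict(func, *iterables):
    """Like map(), but raise ValueError if the iterables differ in length."""
    for args in zip_longest(*iterables, fillvalue=_MISSING):
        if any(arg is _MISSING for arg in args):
            raise ValueError("too few items in iterable")
        yield func(*args)
```

Note that map_strict is a generator, so the error is raised only when iteration reaches the first missing item; results for the complete rows before it are still produced.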

Locate and import Python's standard regression tests (Python)

2010-09-17T11:31:31-07:00

Python recipe 577394 by Steven D'Aprano (import, path, regression, testing).

The Python standard library comes with an extensive set of regression tests. I had need to import and run some of these tests from my own code, but the test directory is not in Python's path. This simple helper function solves the problem.

Approximately Equal (Python)

2010-03-17T17:05:51-07:00

Python recipe 577124 by Steven D'Aprano (comparisons, distance, math, maths, numeric, string).

Generic "approximately equal" function for any object type, with customisable error tolerance.

When called with float arguments, approx_equal(x, y[, tol[, rel]]) compares x and y numerically, and returns True if y is within either absolute error tol or relative error rel of x; otherwise it returns False. The function uses sensible default values for tol and rel.

For any other pair of objects, approx_equal() looks for a method __approx_equal__ and, if found, calls it with arbitrary optional arguments. This allows types to define their own concept of "close enough".
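
The dispatch described above might be sketched as follows. The default tolerances, the helper name _float_approx_equal, and the fallback to plain equality when no __approx_equal__ method exists are all assumptions; the recipe's actual defaults and fallback behaviour may differ.

```python
def _float_approx_equal(x, y, tol=1e-12, rel=1e-7):
    # Assumed defaults; exact equality is tested first so that equal
    # infinities compare as approximately equal.
    if x == y:
        return True
    diff = abs(x - y)
    return diff <= tol or diff <= rel * abs(x)


def approx_equal(x, y, *args, **kwargs):
    """Return True if x and y are approximately equal."""
    if isinstance(x, float) and isinstance(y, float):
        return _float_approx_equal(x, y, *args, **kwargs)
    # Look the method up on the type, as Python does for special methods.
    method = getattr(type(x), "__approx_equal__", None)
    if method is not None:
        return bool(method(x, y, *args, **kwargs))
    return x == y  # assumed fallback when no method is defined


class Word(str):
    """Example type whose idea of "close enough" ignores case."""

    def __approx_equal__(self, other, ignore_case=True):
        if ignore_case:
            return self.lower() == other.lower()
        return str(self) == str(other)
```

With these definitions, approx_equal(Word("ABC"), Word("abc")) is true, while passing ignore_case=False makes the same comparison false.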