Re: [Distutils] PEP for dependencies on libraries like BLAS (was: Re: Working toward Linux wheel support)

From: Nathaniel Smith <n...@pobox.com>
Wed, 12 Aug 2015 23:08:53 -0700
On Wed, Aug 12, 2015 at 8:10 PM, Robert Collins
<robe...@robertcollins.net> wrote:
> On 13 August 2015 at 12:51, Nathaniel Smith <n...@pobox.com> wrote:
>> On Aug 12, 2015 16:49, "Robert Collins" <robe...@robertcollins.net> wrote:
>>>
>>> I'm not sure what will be needed to get the PR accepted; At PyCon AU
>>> Tennessee Leuwenberg started drafting a PEP for the expression of
>>> dependencies on e.g. BLAS - its been given number 497, and is in the
>>> packaging-peps repo; I'm working on updating it now.
>>
>> I wanted to take a look at this PEP, but I can't seem to find it. PEP 497:
>>   https://www.python.org/dev/peps/pep-0497/
>> appears to be something else entirely?
>>
>> I'm a bit surprised to hear that such a PEP is needed. We (= numpy devs)
>> have actively been making plans to ship a BLAS wheel on windows, and AFAICT
>> this is totally doable now -- the blocker is windows toolchain issues, not
>> pypa-related infrastructure.
>>
>> Specifically the idea is to have a wheel that contains the shared library as
>> a regular old data file, plus a stub python package that knows how to find
>> this data file and how to make it accessible to the linker. So
>> numpy/__init__.py would start by calling:
>>
>> import pyopenblas1
>> # on Linux modifies LD_LIBRARY_PATH,
>> # on Windows uses ctypes to preload... whatever
>> pyopenblas1.enable()
>>
>> and then get on with things, or the build system might do:
>>
>> import pyopenblas1
>> pyopenblas1.get_header_directories()
>> pyopenblas1.get_linker_directories()
>>

Thanks to James for sending on the link!

Two main thoughts, now that I've read it over:

1) The motivating example is somewhat confused -- the draft says:

+ The example provided in the abstract is a
+ hypothetical package which needs versions of numpy and scipy, both of which
+ must have been compiled to be aware of the ATLAS compiled set of linear algebra
+ libraries (for performance reasons). This sounds esoteric but is, in fact, a
+ routinely encountered situation which drives people towards using the
+ alternative packaging for scientific python environments.

Numpy and scipy actually work hard to export a consistent, append-only
ABI regardless of what libraries are used underneath. (This is
actually by far our biggest issue with wheels -- that there's still no
way to tag the numpy ABI as part of the ABI string, so in practice
it's just impossible to ever have a smooth migration to a new ABI and
we have no choice but to forever maintain compatibility with numpy
0.1. But that's not what this proposal addresses.) Possibly part of
the confusion here is that Christoph Gohlke's popular numpy+scipy
builds use a hack where instead of making the wheels self-contained
via static linking or something like that, he ships the
actual libBLAS.dll inside the numpy wheel, and then the scipy wheel
has some code added that magically "knows" that there is this special
numpy wheel that it can find libBLAS.dll inside and use it directly
from scipy's own extensions. But this coupling is pretty much just
broken, and it directly motivates the blas-in-its-own-wheel design I
sketched out above.
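
To make the stub-package idea concrete, here is a minimal sketch of what
such a module might contain. Everything in it is an assumption for
illustration: the library file names, the lib/ and include/ layout, and
the package_dir parameter are hypothetical, and no pyopenblas1 package
with this interface is known to exist. The sketch also preloads the
library with ctypes on every platform instead of modifying
LD_LIBRARY_PATH as the quoted comment suggests for Linux; that is a
substitution made here to keep the example short.

```python
import ctypes
import os
import sys


def library_filename(platform=sys.platform):
    """Return the file name of the bundled library (hypothetical names)."""
    if platform.startswith("win"):
        return "libopenblas.dll"
    if platform == "darwin":
        return "libopenblas.dylib"
    return "libopenblas.so"


def get_linker_directories(package_dir):
    """Directories a build system could pass to the linker (assumed layout)."""
    return [os.path.join(package_dir, "lib")]


def get_header_directories(package_dir):
    """Directories a build system could pass to the compiler (assumed layout)."""
    return [os.path.join(package_dir, "include")]


def enable(package_dir):
    """Preload the bundled shared library from the wheel's data files.

    The intent is that extension modules imported afterwards resolve their
    BLAS symbols against this already-loaded copy.
    """
    path = os.path.join(get_linker_directories(package_dir)[0],
                        library_filename())
    if not os.path.exists(path):
        raise OSError("bundled library not found: " + path)
    # RTLD_GLOBAL asks for the symbols to be globally visible on POSIX;
    # ctypes defines it as 0 on platforms without the flag.
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
```

In a real package the directory would presumably be derived from the
module's own location rather than passed in; it is a parameter here only
so the sketch can be run without an installed wheel.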

(I guess the one exception is that if you have a numpy or scipy build
that dynamically links to a library like BLAS, and then another
extension that links to a different BLAS with an incompatible ABI, and
the two BLAS libraries have symbol name collisions, then that could be
a problem because ELF is frustrating like that. But the obvious
solution here is to be careful about how you do your builds -- either
by using static linking, or making sure that incompatible ABIs get
different symbol names.)

Anyway, this doesn't particularly undermine the PEP, but it would be
good to use a more realistic motivating example.

2) AFAICT, the basic goal of this PEP is to provide machinery to let
one reliably build a wheel for some specific version of some specific
distribution, while depending on vendor-provided libraries for various
external dependencies, and providing a nice user experience (e.g.,
telling users explicitly which vendor-provided libraries they need to
install). I say this because strings like "libblas1.so" or "kernel.h"
do not define any fixed ABI or APIs, unless you are implicitly scoping
to some particular distribution with at least some minimum version

It seems like a reasonable effort at solving this problem, and I guess
there are probably some people somewhere that have this problem, but
my concern is that I don't actually know any of those people. The
developers I know instead have the problem of, they want to be able to
provide a small finite number of binaries (ideally six binaries per
Python version: {32 bit, 64 bit} * {windows, osx, linux}) that
together will Just Work on 99% of end-user systems. And that's the
problem that Enthought, Continuum, etc., have been solving for years,
and which wheels already mostly solve on windows and osx, so it seems
like a reasonable goal to aim for. But I don't see how this PEP gets
us any closer to that. Again, not really a criticism -- these goals
aren't contradictory and it's great if pip ends up being able to
handle both common and niche use cases. But I want to make sure that
we're clear that these goals are different and which one each proposal
is aimed at.
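
The "six binaries per Python version" figure above is just the product
of the two architectures and three platforms. A small sketch of that
build matrix, where the Python version strings are arbitrary examples
and binary_matrix is a name made up for this illustration:

```python
from itertools import product

ARCHITECTURES = ("32 bit", "64 bit")
PLATFORMS = ("windows", "osx", "linux")


def binary_matrix(python_versions):
    """List every (python version, architecture, platform) build target."""
    return list(product(python_versions, ARCHITECTURES, PLATFORMS))


# Two example Python versions give 2 * 2 * 3 = 12 binaries in total.
builds = binary_matrix(["2.7", "3.4"])
print(len(builds))  # 12
```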

>> This doesn't help if you want to declare dependencies on external, system
>> managed libraries and have those be automatically somehow provided or
>> checked for, but to me that sounds like an impossible boil-the-ocean project
>> anyway, while the above is trivial and should just work.
>
> Well, have a read of the draft.
>
> Its a solved problem by e.g. conda, apt, yum, nix and many others.

None of these projects allow a .deb to depend on .rpms etc. -- they
all require that they own the whole world with some narrow, carefully
controlled exceptions (e.g. anaconda requires some non-trivial runtime
on the host system -- glibc, glib, pcre, expat, ... -- but it's a
single fixed set that they've empirically determined is close enough
to universally available in practice). The "boil the ocean" part is
the part where everybody who wants to distribute wheels has to go
around and figure out every possible permutation of ABIs on every
possible external packaging system and provide separate wheels for
each of them.

> Uploading system .so's is certainly also an option, and I see no
> reason why we can't do both.
>
> I do know that distribution vendors are likely to be highly allergic
> to the idea of having regular shared libraries present as binaries,
> but thats a different discussion :)

Yeah, but basically in the same way that they're allergic to all
wheels, period, so ... :-). I think in the long run the only realistic
approach is for most users to either be getting blas+numpy from some
external system like macports/conda/yum/... or else to be getting
blas+numpy from official wheels on pypi. And neither of these two
scenarios seems to benefit from the functionality described in this PEP.

(Final emphasis: this is all just my own opinion based on my
far-from-omniscient view of the packaging system, please tell me if
I'm making some ridiculous error, or if well-actually libBLAS is
special and there is some other harder case I'm not thinking of, etc.)


Nathaniel J. Smith -- http://vorpus.org
Distutils-SIG maillist  -  Dist...@python.org