Issue #42 new

Dependencies should be installed one-at-a-time

Nathaniel Smith created an issue

It is a sad fact that many packages have a setup.py which imports other packages. This is sub-optimal, but there is unfortunately no way to avoid it given the current state of Python packaging tools. For example, any package which uses the NumPy C API has to import numpy in its setup.py in order to ask it for build information. As a result, the packages 'scipy' and 'pandas' both have a setup-time requirement that numpy is already installed.

The official word from the 'pip' designers is that the only way to install such packages is by running multiple pip invocations: https://github.com/pypa/pip/issues/25

That is, this does not (and will never) work: pip install numpy scipy pandas

This does work, and is the only supported method: pip install numpy; pip install scipy; pip install pandas

My package that I want to test with tox depends on these other packages. But when I try to run tox, it always generates the first form (which doesn't work), and there's no way to tell it to generate the second form (which does). The result is that tox cannot produce working virtualenvs for testing.

AFAICT the only advantage of the first form is that if there is some problem with the packages, pip can notice earlier and avoid trashing your python install. However, tox only ever installs into throwaway virtualenvs, so this is not really an advantage. Therefore I'd suggest that tox always and unconditionally process the deps= option by calling 'pip install' separately for each entry, and in the order they are listed in the config file.
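
A minimal sketch of the two forms, assuming tox builds each pip invocation as an argument list. The function name `build_install_commands` and its `sequential` flag are hypothetical and are not part of tox:

```python
def build_install_commands(deps, sequential=True, pip="pip"):
    """Return a list of argv lists, one per pip invocation.

    Hypothetical helper for illustration only; not tox's actual code.
    """
    deps = [d.strip() for d in deps if d.strip()]
    if not deps:
        return []
    if sequential:
        # Second form: one 'pip install' per dep, in config-file order.
        return [[pip, "install", dep] for dep in deps]
    # First form: every dep passed to a single 'pip install'.
    return [[pip, "install", *deps]]


commands = build_install_commands(["numpy", "scipy", "pandas"])
for argv in commands:
    print(" ".join(argv))
```

With `sequential=False` the same helper yields the single combined invocation that tox generates today.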

Comments (13)

  1. holger krekel

    Congrats on the nice issue number :)

    I think we will need a new option to allow for separation of dep-installs because other configurations may require the current default of installing multiple deps in one go (they might require each other).

    Maybe "dep_install = separate"? Or we could invent a qualifier for a dep-specification like this:

     deps = first:numpy
            scipy
            ...
    

    It more directly expresses the need to install numpy first, and I'd slightly prefer this implementation. What do you think?

  2. Nathaniel Smith

    I think that with the way distutils/setuptools/distribute work, if you have two packages that depend on each other at install-time then they are actually impossible to install by any means. So I doubt you'll run into many such packages... Python package managers don't have a concept of separated install and setup phases like dpkg/rpm do.

    AFAIU, pip's logic is:

    • First, ask each package for its dependencies (this requires running setup.py, which is what causes the problems)
    • Then, pick a linear order to install them in, and install them just as if by doing multiple calls to pip.

    So a linear order should always be possible, since that's what pip does in any case...
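
The "pick a linear order" step described above can be sketched as a topological sort. This is only an illustration of the idea, not pip's actual resolver; the dependency graph and the names `install_order` and `has_linear_order` are hypothetical, and `graphlib` requires Python 3.9 or later:

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Hypothetical setup-time graph: package -> packages that must be installed first.
setup_requires = {
    "numpy": set(),
    "scipy": {"numpy"},
    "pandas": {"numpy"},
}


def install_order(graph):
    """Return one linear install order in which prerequisites come first."""
    return list(TopologicalSorter(graph).static_order())


def has_linear_order(graph):
    """Return False when the graph contains a setup-time cycle."""
    try:
        install_order(graph)
    except CycleError:
        return False
    return True


print(install_order(setup_requires))
print(has_linear_order({"a": {"b"}, "b": {"a"}}))
```

A cycle at setup time has no valid order at all, which matches the point that two packages needing each other at install time cannot be installed by any means.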

    Now, there are probably tox.ini files in the wild that don't have dependencies listed in the proper order. I think in that case pip will just do its standard dependency resolution anyway, though -- if 'a' depends on 'b', and you do 'pip install a; pip install b' then the first call will install both 'a' and 'b', and the second call will be no-op. No big deal.

    Still, if you want to be conservative, maybe the easiest way would be to have an install_deps_sequentially=False|True option?

    (With the first: syntax, you may also have some trouble because : can occur in dependency names -- e.g. I have a dependency that is just a http://... URL.)

  3. holger krekel

    If we can default to separate installs without difficulty, it's of course preferable. Tox could then guarantee to install separately in the order in which deps were specified. I can imagine a problem: consider dependencies that are specified as github/bitbucket addresses or files instead of PyPI distribution names. If they require each other then separate installs cannot work, can they?

    As to syntax, it's actually possible today to define groups by using different "pypi" servers that all point to the same one, something like this:

    [tox]
    indexserver = 
        g1 = http://pypi.python.org
        g2 = http://pypi.python.org
    
    [testenv]
    deps = :g1: numpy
               :g2: scipy
               ...
    

    This works because tox issues different pip invocations for each indexserver. We could extend this syntax to allow numbers directly, so that ":1:" does not need an "indexserver" entry but uses the default one, still maintaining the separation.

    On a sidenote, also telling this to myself, let's not get too pip-specific as there are people who would like to see a variant of tox runs that uses easy_install.

  4. Mikhail Korobov

    Just an idea: what about adding 'commands' to CreationConfig and allowing the execution of arbitrary commands right after virtualenv creation via some tox.ini option (something like 'post_creation_commands')?

  5. Nathaniel Smith

    On further thought, you're right -- there are situations where you have to pass multiple packages to 'pip' at the same time. Example: package "a" depends on "b", and vice-versa. Since this is a run-time dependency, not an install-time dependency, there is no problem. At the setup.py level, you have to install them sequentially, but you can do that in either order. At the 'pip' level, when you do 'pip install a' then it will automatically install b as well, so calling pip twice sequentially like 'pip install a; pip install b' will also work.

    But!

    If your dependencies are "a==1.0 b==1.2", then you have to pass them both to pip in a single call. If you do 'pip install a==1.0' first, then you'll get some random version of 'b', and vice-versa.

    New idea, very similar to Mikhail's: how about having a 'setup_commands=' option that defaults to 'pip install {deps}', but can be overridden. This would replace current deps processing, and have the same semantics (only run at virtualenv setup time, etc.)

  6. holger krekel

    I definitely want tox to grow more customizability for dep installation. If we can keep it declarative, I'd prefer that; see the indexserver grouping above for an example. Declarations have the advantage that they can usually remain compatible through successive new versions and features of tox. And allowing imperative ways, like a "setup_commands" as a substitute for "pip install {deps}", raises problems. For example, how does this interact with indexserver grouping?

  7. Nathaniel Smith

    Fair enough. We still need some sort of reasonable declarative syntax for grouping together subsets of the deps, then. Some options:

    deps1 = ...
    deps2 = ...
    
    deps = (numpy) (scipy) (nose coverage foo bar)
    
    deps = numpy | scipy | nose coverage foo bar
    

    (AFAICT neither () nor | are legal characters in requirements specifications, so they should be safe to use here: http://packages.python.org/distribute/pkg_resources.html#requirements-parsing)

    (EDIT: originally I had {} braces instead of (), but of course that clashes with tox's variable substitution syntax, duh.)
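
A minimal sketch of how the proposed `|` form could be split into one pip invocation per group. The helper name `split_dep_groups` is hypothetical, and the sketch assumes that no requirement contains internal whitespace (so `numpy>=1.6` works but `numpy >= 1.6` would not):

```python
def split_dep_groups(spec):
    """Split 'a | b c | d' into [['a'], ['b', 'c'], ['d']].

    Hypothetical parser for the proposed syntax; tox does not implement it.
    """
    groups = []
    for chunk in spec.split("|"):
        names = chunk.split()
        if names:  # ignore empty groups such as a trailing '|'
            groups.append(names)
    return groups


groups = split_dep_groups("numpy | scipy | nose coverage foo bar")
for group in groups:
    print("pip install " + " ".join(group))
```

Pinned sets such as `a==1.0 b==1.2` stay together in a single group, so they would still reach pip in one call.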

  8. Florian Rathgeber

    The workaround holger mentions doesn't work for me. Despite using 2 different (yet identical) index servers, tox still generates only a single pip invocation:

    cmdargs=[local('/tmp/foo/.tox/py27/bin/pip'), 'install', '-i', 'http://pypi.python.org', '--download-cache=/tmp/foo/.tox/_download', 'numpy>=1.6.0', 'h5py>=2.0.0']
    

    This is the dummy package I've been using: https://gist.github.com/ae38982d5a53de5eaf8b

  9. Florian Rathgeber

    holger krekel: Thanks. I'm not sure this is an ideal workaround, though: isn't it a feature to group by the actual value of the indexserver?

    I've now settled on a different, less than ideal solution: I only list numpy in the deps and explicitly install h5py in commands via 'pip install h5py>=2.0.0'

    See the updated gist: https://gist.github.com/ae38982d5a53de5eaf8b

    Installing h5py still fails, however, because I need to tell it where to look for 'mpi.h' by exporting the environment variable 'C_INCLUDE_PATH' before calling pip. I couldn't find a way of exporting environment variables in 'tox.ini' since the commands aren't run in a shell. Could someone help out? Or should I file a separate bug?

  10. Marc Abramowitz

    Does it make sense to pursue adding an option to pip to make it "serialize" the installs?

    Because:

    • This issue is not limited to just tox.
    • I am leaning lately towards not listing deps directly in tox.ini and instead being more DRY by referencing a requirements file that I already have -- e.g.:
    deps = -r{toxinidir}/test.pipreq
    

    In the above case, tox wouldn't know how to serialize and only pip would.
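
For illustration, a rough sketch of what serializing a requirements file outside pip would involve. The helper `requirement_lines` and the sample file contents are hypothetical. The sketch skips blank lines, comment lines, and option lines such as `-r` or `-e`, and does not handle inline comments or line continuations:

```python
def requirement_lines(text):
    """Return plain requirement specifiers from requirements-file text.

    Hypothetical, deliberately simplified reader; not pip's parser.
    """
    reqs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # blank line, comment, or option line
        reqs.append(line)
    return reqs


# Hypothetical contents of a file like test.pipreq.
sample = """\
# test requirements
numpy>=1.6.0

h5py>=2.0.0
"""

for req in requirement_lines(sample):
    print(f"pip install '{req}'")
```

Anything beyond this simple case would mean reimplementing pip's own file handling, which is the argument for putting the option in pip rather than in tox.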
