This article is reproduced from serverfault.com.

What are the pitfalls of exclusively using PIP in a CONDA environment?

Posted on 2020-12-01 18:00:25

Background

The official documentation and a blog post on the same website both recommend installing as many requirements as possible with conda, and only then using pip. Apparently this is because conda is unaware of any changes pip makes to the dependencies and therefore cannot resolve dependencies correctly.

Question

Now, if one uses pip exclusively and installs nothing with conda, it seems reasonable to expect that conda does not need to be aware of any changes made by pip, since conda effectively becomes a mere tool for isolating dependencies and managing versions. However, this goes against the official recommendation, because one will NOT be installing as many requirements as possible with conda.

So the question remains: is there any known drawback to using pip exclusively in a conda environment?

Similar Topics

A similar topic has been touched on elsewhere, but that discussion does not cover the case of exclusively using pip in a conda environment.

Questioner
Leonardus Chen
merv 2020-12-02 06:32:48

Not sure one can give a comprehensive answer on this, but some of the major things that come to mind are:

  1. Lack of deep support for non-Python dependency resolution. While more wheels that bundle non-Python resources have become available over time, it is nowhere near the coverage that Conda provides by being a general package manager rather than Python-specific. For anyone doing interoperable computing (e.g., reticulate), I would expect Conda to be favored.

  2. Optimized libraries. Sort of related to the first point, but the Anaconda team has made an effort to build optimized versions of packages (e.g., MKL for numpy). Not sure if the equivalent is available through PyPI. [1]

  3. Wasteful redundancy across environments. Conda uses hardlinking when packages and environments are on the same volume, and supports softlinking for spanning across volumes. This helps to minimize replicating any packages that are installed in multiple environments.

  4. Complicates exporting. When exporting (conda env export), Conda doesn't pick up all pip-installed packages, only the ones that come from PyPI. That is, it will miss things installed from GitHub, etc. If one did go the pip-only route, I think a more reliable export strategy would be to use pip freeze > requirements.txt, and then make a YAML like

    channels:
      - defaults
    dependencies:
      - python=3.8  # specify the version
      - pip
      - pip:
        - -r requirements.txt
    

    with which to recreate the environment.
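
The export strategy in point 4 could be scripted roughly as follows. This is only a sketch under assumptions: the helper names freeze_requirements and build_environment_yaml are made up for illustration, and the generated file simply mirrors the YAML shown above rather than anything Conda produces itself.

```python
import subprocess
import sys
from pathlib import Path


def freeze_requirements(path="requirements.txt"):
    """Write `pip freeze` output for the current interpreter to `path`.

    Hypothetical helper: equivalent to running `pip freeze > requirements.txt`
    inside the activated environment.
    """
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    )
    Path(path).write_text(result.stdout)
    return path


def build_environment_yaml(python_version, requirements_file="requirements.txt"):
    """Return environment YAML text that delegates all packages to pip.

    Hypothetical helper: Conda only provides Python and pip here; everything
    else is listed in the requirements file.
    """
    lines = [
        "channels:",
        "  - defaults",
        "dependencies:",
        f"  - python={python_version}",  # pin the interpreter version
        "  - pip",
        "  - pip:",
        f"    - -r {requirements_file}",
    ]
    return "\n".join(lines) + "\n"


# Possible usage inside the activated environment (not executed here):
#   req = freeze_requirements()
#   version = f"{sys.version_info.major}.{sys.version_info.minor}"
#   Path("environment.yml").write_text(build_environment_yaml(version, req))
```

The resulting file can then be passed to conda env create -f environment.yml, with requirements.txt kept alongside it.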

All that said, I could easily imagine that none of these matter to some people (most of them are conveniences), especially those who tend to work purely in Python. In such cases, however, I don't see why one would not simply forgo Conda altogether and use a Python-specific virtual environment manager.
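
For the pure-Python case, one such Python-specific option is the standard library's venv module. A minimal sketch, assuming a hypothetical helper name create_plain_venv and an arbitrary target directory:

```python
import venv
from pathlib import Path


def create_plain_venv(target, with_pip=True):
    """Create a virtual environment at `target` using the stdlib venv module.

    Hypothetical helper for illustration; returns the path to the
    pyvenv.cfg file that venv writes into the new environment.
    """
    venv.create(target, with_pip=with_pip)
    return Path(target) / "pyvenv.cfg"


# Possible usage (not executed here):
#   create_plain_venv(".venv")
# followed by activating the environment and running
#   pip install -r requirements.txt
```

Unlike a Conda environment, this reuses whatever Python interpreter runs the script, so the interpreter version has to be managed separately.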


[1] Someone please correct me if you know otherwise.