Managing Python dependencies reliably

Are you running your development environment with different packages than production? And your tests? Or even different parts of production?

The common advice (example) is to pin the versions of your packages in a requirements.txt file (that is, to define exactly which version of each package works for your program). That advice fails because the most common way to pin dependencies is to install them in a virtual environment and run pip freeze > requirements.txt. This approach has multiple problems:

  1. The environment might have broken dependencies. pip install only checks the dependency requirements of the last package installed, so that's the only one guaranteed to be consistent.
  2. That environment will contain your direct dependencies and, recursively, the dependencies of those dependencies. It will be difficult to identify in the future what your app actually requires, particularly when checking for version conflicts between dependencies of dependencies (see the sketch after this list).
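
To make the second problem concrete: for an app that directly uses only flask and requests, pip freeze produces something like this (versions are illustrative), and nothing in the output distinguishes the two direct dependencies from their transitive ones:

    $ pip freeze
    blinker==1.6.2
    certifi==2023.7.22
    charset-normalizer==3.2.0
    click==8.1.7
    Flask==2.3.3
    idna==3.4
    itsdangerous==2.1.2
    Jinja2==3.1.2
    MarkupSafe==2.1.3
    requests==2.31.0
    urllib3==2.0.4
    Werkzeug==2.3.7
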
This is not the only way to solve these problems: you can use pipconflictchecker to deal with the first one, or you can use pipenv. But pip-tools has served me well for years, so I thought I'd share my workflow and how it solves the problems above.

First you need to install pip-tools (preferably inside your app's virtual environment): pip install pip-tools

Then you need to create the file requirements.in, which has mostly the same format as a requirements.txt file. It contains the packages your app requires directly. I usually avoid setting versions in it, unless there's a known issue with a specific package (in that case the line above the pin gets a comment and a link to the issue). This is the file you'll be creating and updating by hand; every dependency here should be used directly by your app.
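
A minimal requirements.in might look like this (somepackage and its issue are hypothetical, just to show the pin-with-comment pattern):

    flask
    requests
    # somepackage 2.0 breaks our build, see <link to the issue>
    somepackage<2.0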

The next step is "compiling" the requirements with pip-compile. This command fetches the dependency information for each package and computes a set of pinned versions that satisfies the constraints of every package, direct and transitive. The result is saved to a file called requirements.txt. You'll notice that, for dependencies of dependencies, pip-compile generates comments explaining which packages require them.
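
For the hypothetical requirements.in above, the compiled requirements.txt looks roughly like this (the exact layout of the "via" comments varies between pip-tools versions):

    $ pip-compile requirements.in
    $ cat requirements.txt
    blinker==1.6.2
        # via flask
    certifi==2023.7.22
        # via requests
    click==8.1.7
        # via flask
    flask==2.3.3
        # via -r requirements.in
    ...
    requests==2.31.0
        # via -r requirements.in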

Once you have the requirements.txt file you can install it with pip install -r requirements.txt, but I prefer pip-tools' pip-sync, which not only installs the packages but also uninstalls any package that is not in the requirements.txt file.
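
Side by side, assuming requirements.txt is in the current directory:

    # plain pip: installs what's listed, but leaves stray packages behind
    $ pip install -r requirements.txt

    # pip-sync: installs what's listed and uninstalls everything else
    $ pip-sync requirements.txt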

I also include the development/test dependencies in requirements.in. That's the only way to guarantee that the dependency set in requirements.txt has the same versions you test against. It's possible to split out a requirements-dev.txt subset of requirements.txt and run pip uninstall -r requirements-dev.txt before deploying to production, but I'll admit I only do that for more complex projects.
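
A sketch of that split, assuming you maintain requirements-dev.txt by hand as the dev-only lines copied out of requirements.txt:

    # on the production machine/image, after installing everything:
    $ pip uninstall -y -r requirements-dev.txt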

With this setup I can then use the following cheatsheet:

Action                                        Instructions
Add a package as a dependency                 Add the package name to requirements.in. Run pip-compile.
Remove a dependency the app no longer uses    Remove the package name from requirements.in. Run pip-compile.
Upgrade dependencies to the current versions  Run pip-compile --upgrade. (1)
Update the local environment                  Run pip-sync.
Deploy                                        Run pip-sync on the server / during the Docker image build.

(1) I usually run this weekly to create a pull request, so the changes go through the tests and QA.
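
The weekly routine is then something like this (the branch name is just an example):

    $ git checkout -b weekly-upgrade
    $ pip-compile --upgrade   # re-resolve everything to the latest compatible versions
    $ pip-sync                # update the local environment before running the tests
    $ git commit -am "Upgrade dependencies" && git push  # then open the pull request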
