Improve Your Python Dependency Management with pip-tools
As an engineer who loves solving problems with Python and creates dozens of projects a year, I find that keeping track of different packages and their versions can get complex. I often have to test and deploy my projects in different environments (development, testing, and production machines) and on different cloud or PaaS providers, and I need to be sure that all of them use exactly the same Python packages and versions I used during development, so that no problems are introduced by an unexpected package upgrade.
A naive approach would be to install the packages you need using pip and, at the end, generate a requirements.txt file with
pip freeze > requirements.txt
to persist all the dependencies (with their versions) that got installed. Installing from that requirements file into an empty virtual environment later gets us the same packages and versions.
But this approach is problematic because it is virtually impossible to know which packages are needed by our application and which were pulled in as dependencies:
- Suppose I choose to switch from pandas to polars in my project. In that scenario, the requirements.txt file may contain many pandas dependencies that polars does not need. Unfortunately, we cannot tell which ones they are.
- If you need to upgrade some of the packages (e.g. Django from version 3.2 to 4.2), which Django dependencies need to be upgraded, and to which versions? Which new dependencies does the new Django version have?
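To make the problem concrete, here is a minimal sketch (not part of pip-tools) that tries to guess the top-level packages from a dependency mapping. The helper names requirement_name, top_level_packages, and installed_requires are illustrative assumptions, and the sample mapping only mirrors the pandas dependencies discussed in this post.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize so that "Python_DateUtil" and "python-dateutil" compare equal.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def top_level_packages(requires: dict[str, list[str]]) -> list[str]:
    """Return the packages that no other package in the mapping depends on."""
    names = {requirement_name(name) for name in requires}
    depended_on = {
        requirement_name(req) for reqs in requires.values() for req in reqs
    }
    return sorted(names - depended_on)


def installed_requires() -> dict[str, list[str]]:
    """Build the mapping for the current environment.

    Simplification: requirements that belong to optional extras are skipped,
    and other environment markers are not evaluated.
    """
    mapping: dict[str, list[str]] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        mapping[name] = [
            req for req in (dist.requires or [])
            if not re.search(r"extra\s*==", req)
        ]
    return mapping


# Hand-written sample: pandas and the packages it pulls in.
SAMPLE = {
    "pandas": ["numpy", "python-dateutil", "pytz", "tzdata"],
    "python-dateutil": ["six"],
    "numpy": [],
    "pytz": [],
    "tzdata": [],
    "six": [],
}

print(top_level_packages(SAMPLE))  # → ['pandas']
```

Even this heuristic cannot recover intent: a package you import directly that also happens to be a dependency of another package (numpy, for example) would not show up as top-level. A frozen list simply does not record that information.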
That is why I use the pip-tools project to simplify the process of package dependency management. In this blog post, we will discuss what pip-tools does, the problem it solves and why developers should consider using it, and provide five examples using pip-tools for managing Python project dependencies. Additionally, we will explore some alternatives to pip-tools.
What is pip-tools?
pip-tools is an open-source project created to simplify the management of the requirements files used in Python projects. It takes a project's direct dependencies and recursively resolves them into requirements files with pinned versions.
Moreover, pip-tools lets you manage your project dependencies with minimal effort and gives you a reproducible development environment. Other developers working on the same project get dependencies identical to yours, with no version conflicts.
This can help reduce time spent debugging issues related to package conflicts and ensure that code runs consistently across different environments.
- It streamlines the generation and maintenance of requirements files
- It ensures that all packages are pinned to a specific version to avoid version conflicts
- It provides an easier way to manage different environments, like production, development, and testing
- It permits simple package updating and upgrading as new versions become available
Examples of using pip-tools
Here are five examples of how developers can use pip-tools to manage the package requirements of their Python projects:
Basic usage of pip-tools with requirements files
Developers can use pip-tools to manage their project dependencies by creating a requirements.in file with the required packages and their respective versions.
Listing 1: requirements.in
pandas>=2.0
scikit-learn
xgboost==1.7.6
Developers can then run the following command to generate the requirements.txt file:
$ pip-compile requirements.in
This generates a pinned version requirement file with all the packages and their respective dependencies:
Listing 2: Generated requirements.txt
#
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile requirements.in
#
joblib==1.3.1
# via scikit-learn
numpy==1.25.1
# via
# pandas
# scikit-learn
# scipy
# xgboost
pandas==2.0.3
# via -r requirements.in
python-dateutil==2.8.2
# via pandas
pytz==2023.3
# via pandas
scikit-learn==1.3.0
# via -r requirements.in
scipy==1.11.1
# via
# scikit-learn
# xgboost
six==1.16.0
# via python-dateutil
threadpoolctl==3.2.0
# via scikit-learn
tzdata==2023.3
# via pandas
xgboost==1.7.6
# via -r requirements.in
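The "# via" comments in the generated file record why each package is there, which is exactly the information a plain pip freeze loses. As a rough sketch, the annotations can be read back with a few lines of Python. The functions parse_via and direct_dependencies are hypothetical helpers written for this post, not part of pip-tools, and the sample is a shortened excerpt of Listing 2.

```python
def parse_via(requirements_text: str) -> dict[str, list[str]]:
    """Map each pinned package to the sources listed in its '# via' comment."""
    via: dict[str, list[str]] = {}
    current = None
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("-"):
            # Skip blank lines and option lines such as --index-url or --hash.
            continue
        if not line.startswith("#"):
            # A pin line such as "numpy==1.25.1".
            current = line.split("==")[0].strip()
            via[current] = []
            continue
        if current is None:
            # Header comments that appear before the first pin.
            continue
        comment = line.lstrip("#").strip()
        if comment == "via":
            comment = ""
        elif comment.startswith("via "):
            comment = comment[len("via "):].strip()
        if comment:
            via[current].append(comment)
    return via


def direct_dependencies(via: dict[str, list[str]]) -> list[str]:
    """Return the packages that come straight from a requirements input file."""
    return sorted(
        name for name, sources in via.items()
        if any(source.startswith("-r ") for source in sources)
    )


SAMPLE = """\
# This file is autogenerated by pip-compile with Python 3.10
joblib==1.3.1
    # via scikit-learn
numpy==1.25.1
    # via
    #   pandas
    #   scikit-learn
pandas==2.0.3
    # via -r requirements.in
scikit-learn==1.3.0
    # via -r requirements.in
"""

via = parse_via(SAMPLE)
print(via["numpy"])              # → ['pandas', 'scikit-learn']
print(direct_dependencies(via))  # → ['pandas', 'scikit-learn']
```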
pip-tools with multiple environments
We can define several files with libraries to use in multiple environments, such as development, production, and testing. We can create corresponding dev-requirements.in, prod-requirements.in, and test-requirements.in files and then use the following commands to generate their respective files:
$ pip-compile dev-requirements.in
$ pip-compile prod-requirements.in
$ pip-compile test-requirements.in
These commands will generate dev-requirements.txt, prod-requirements.txt, and test-requirements.txt files with the corresponding dependencies.
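The three commands above can also be driven from a small Python script, for example in a CI job. This is only a sketch: it assumes pip-compile is available on the PATH, and the helper names compile_commands, expected_output, and compile_all are hypothetical.

```python
import subprocess
from pathlib import Path

INPUT_FILES = ["dev-requirements.in", "prod-requirements.in", "test-requirements.in"]


def compile_commands(input_files: list[str]) -> list[list[str]]:
    """Build one pip-compile invocation per input file."""
    return [["pip-compile", name] for name in input_files]


def expected_output(input_file: str) -> str:
    """Name of the file pip-compile writes for a given .in file (same name, .txt suffix)."""
    return str(Path(input_file).with_suffix(".txt"))


def compile_all(input_files: list[str]) -> None:
    """Run every command and stop at the first failure."""
    for command in compile_commands(input_files):
        subprocess.run(command, check=True)


for name in INPUT_FILES:
    print(f"{name} -> {expected_output(name)}")

# To actually compile the files, call:
# compile_all(INPUT_FILES)
```

Keeping the command construction separate from the subprocess call makes the script easy to check without running pip-compile.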
pip-tools with custom package indexes
We can also use pip-tools with package indexes other than the official PyPI index. To do this, we can specify a custom index in the requirements.in file, like this:
--index-url https://custompackageindex.com/
django==3.2.5
Then, running pip-compile requirements.in will output a requirements.txt file with the packages pinned to versions on the custom package index.
pip-tools with hash dependencies
If you want to pin your dependencies to the hashes of specific distribution files, such as binary wheels (and not only to a package version), you can use the following command:
$ pip-compile --generate-hashes requirements.in
This will add hash values to each package in the requirements file, including the transitive dependencies. This is useful for security, as PyPI has “made the deliberate choice to allow wheel files to be added to old releases”.
requirements.txt file with generated hashes:
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile --generate-hashes requirements.in
#
joblib==1.3.1 \
--hash=sha256:1f937906df65329ba98013dc9692fe22a4c5e4a648112de500508b18a21b41e3 \
--hash=sha256:89cf0529520e01b3de7ac7b74a8102c90d16d54c64b5dd98cafcd14307fdf915
# via scikit-learn
numpy==1.25.1 \
--hash=sha256:012097b5b0d00a11070e8f2e261128c44157a8689f7dedcf35576e525893f4fe \
--hash=sha256:0d3fe3dd0506a28493d82dc3cf254be8cd0d26f4008a417385cbf1ae95b54004 \
--hash=sha256:0def91f8af6ec4bb94c370e38c575855bf1d0be8a8fbfba42ef9c073faf2cf19 \
--hash=sha256:1a180429394f81c7933634ae49b37b472d343cccb5bb0c4a575ac8bbc433722f \
--hash=sha256:1d5d3c68e443c90b38fdf8ef40e60e2538a27548b39b12b73132456847f4b631 \
--hash=sha256:20e1266411120a4f16fad8efa8e0454d21d00b8c7cee5b5ccad7565d95eb42dd \
--hash=sha256:247d3ffdd7775bdf191f848be8d49100495114c82c2bd134e8d5d075fb386a1c \
--hash=sha256:35a9527c977b924042170a0887de727cd84ff179e478481404c5dc66b4170009 \
--hash=sha256:38eb6548bb91c421261b4805dc44def9ca1a6eef6444ce35ad1669c0f1a3fc5d \
--hash=sha256:3d7abcdd85aea3e6cdddb59af2350c7ab1ed764397f8eec97a038ad244d2d105 \
--hash=sha256:41a56b70e8139884eccb2f733c2f7378af06c82304959e174f8e7370af112e09 \
--hash=sha256:4a90725800caeaa160732d6b31f3f843ebd45d6b5f3eec9e8cc287e30f2805bf \
--hash=sha256:6b82655dd8efeea69dbf85d00fca40013d7f503212bc5259056244961268b66e \
--hash=sha256:6c6c9261d21e617c6dc5eacba35cb68ec36bb72adcff0dee63f8fbc899362588 \
--hash=sha256:77d339465dff3eb33c701430bcb9c325b60354698340229e1dff97745e6b3efa \
--hash=sha256:791f409064d0a69dd20579345d852c59822c6aa087f23b07b1b4e28ff5880fcb \
--hash=sha256:9a3a9f3a61480cc086117b426a8bd86869c213fc4072e606f01c4e4b66eb92bf \
--hash=sha256:c1516db588987450b85595586605742879e50dcce923e8973f79529651545b57 \
--hash=sha256:c40571fe966393b212689aa17e32ed905924120737194b5d5c1b20b9ed0fb171 \
--hash=sha256:d412c1697c3853c6fc3cb9751b4915859c7afe6a277c2bf00acf287d56c4e625 \
--hash=sha256:d5154b1a25ec796b1aee12ac1b22f414f94752c5f94832f14d8d6c9ac40bcca6 \
--hash=sha256:d736b75c3f2cb96843a5c7f8d8ccc414768d34b0a75f466c05f3a739b406f10b \
--hash=sha256:e8f6049c4878cb16960fbbfb22105e49d13d752d4d8371b55110941fb3b17800 \
--hash=sha256:f76aebc3358ade9eacf9bc2bb8ae589863a4f911611694103af05346637df1b7 \
--hash=sha256:fd67b306320dcadea700a8f79b9e671e607f8696e98ec255915c0c6d6b818503
# via
# pandas
# scikit-learn
# scipy
# xgboost
pandas==2.0.3 \
--hash=sha256:04dbdbaf2e4d46ca8da896e1805bc04eb85caa9a82e259e8eed00254d5e0c682 \
--hash=sha256:1168574b036cd8b93abc746171c9b4f1b83467438a5e45909fed645cf8692dbc \
--hash=sha256:1994c789bf12a7c5098277fb43836ce090f1073858c10f9220998ac74f37c69b \
--hash=sha256:258d3624b3ae734490e4d63c430256e716f488c4fcb7c8e9bde2d3aa46c29089 \
--hash=sha256:32fca2ee1b0d93dd71d979726b12b61faa06aeb93cf77468776287f41ff8fdc5 \
--hash=sha256:37673e3bdf1551b95bf5d4ce372b37770f9529743d2498032439371fc7b7eb26 \
--hash=sha256:3ef285093b4fe5058eefd756100a367f27029913760773c8bf1d2d8bebe5d210 \
--hash=sha256:5247fb1ba347c1261cbbf0fcfba4a3121fbb4029d95d9ef4dc45406620b25c8b \
--hash=sha256:5ec591c48e29226bcbb316e0c1e9423622bc7a4eaf1ef7c3c9fa1a3981f89641 \
--hash=sha256:694888a81198786f0e164ee3a581df7d505024fbb1f15202fc7db88a71d84ebd \
--hash=sha256:69d7f3884c95da3a31ef82b7618af5710dba95bb885ffab339aad925c3e8ce78 \
--hash=sha256:6a21ab5c89dcbd57f78d0ae16630b090eec626360085a4148693def5452d8a6b \
--hash=sha256:81af086f4543c9d8bb128328b5d32e9986e0c84d3ee673a2ac6fb57fd14f755e \
--hash=sha256:9e4da0d45e7f34c069fe4d522359df7d23badf83abc1d1cef398895822d11061 \
--hash=sha256:9eae3dc34fa1aa7772dd3fc60270d13ced7346fcbcfee017d3132ec625e23bb0 \
--hash=sha256:9ee1a69328d5c36c98d8e74db06f4ad518a1840e8ccb94a4ba86920986bb617e \
--hash=sha256:b084b91d8d66ab19f5bb3256cbd5ea661848338301940e17f4492b2ce0801fe8 \
--hash=sha256:b9cb1e14fdb546396b7e1b923ffaeeac24e4cedd14266c3497216dd4448e4f2d \
--hash=sha256:ba619e410a21d8c387a1ea6e8a0e49bb42216474436245718d7f2e88a2f8d7c0 \
--hash=sha256:c02f372a88e0d17f36d3093a644c73cfc1788e876a7c4bcb4020a77512e2043c \
--hash=sha256:ce0c6f76a0f1ba361551f3e6dceaff06bde7514a374aa43e33b588ec10420183 \
--hash=sha256:d9cd88488cceb7635aebb84809d087468eb33551097d600c6dad13602029c2df \
--hash=sha256:e4c7c9f27a4185304c7caf96dc7d91bc60bc162221152de697c98eb0b2648dd8 \
--hash=sha256:f167beed68918d62bffb6ec64f2e1d8a7d297a038f86d4aed056b9493fca407f \
--hash=sha256:f3421a7afb1a43f7e38e82e844e2bca9a6d793d66c1a7f9f0ff39a795bbc5e02
# via -r requirements.in
python-dateutil==2.8.2 \
--hash=sha256:0123cacc1627ae19ddf3c27a5de5bd67ee4586fbdd6440d9748f8abb483d3e86 \
--hash=sha256:961d03dc3453ebbc59dbdea9e4e11c5651520a876d0f4db161e8674aae935da9
# via pandas
pytz==2023.3 \
--hash=sha256:1d8ce29db189191fb55338ee6d0387d82ab59f3d00eac103412d64e0ebd0c588 \
--hash=sha256:a151b3abb88eda1d4e34a9814df37de2a80e301e68ba0fd856fb9b46bfbbbffb
# via pandas
scikit-learn==1.3.0 \
--hash=sha256:0e8102d5036e28d08ab47166b48c8d5e5810704daecf3a476a4282d562be9a28 \
--hash=sha256:151ac2bf65ccf363664a689b8beafc9e6aae36263db114b4ca06fbbbf827444a \
--hash=sha256:1d54fb9e6038284548072df22fd34777e434153f7ffac72c8596f2d6987110dd \
--hash=sha256:3a11936adbc379a6061ea32fa03338d4ca7248d86dd507c81e13af428a5bc1db \
--hash=sha256:436aaaae2c916ad16631142488e4c82f4296af2404f480e031d866863425d2a2 \
--hash=sha256:552fd1b6ee22900cf1780d7386a554bb96949e9a359999177cf30211e6b20df6 \
--hash=sha256:6a885a9edc9c0a341cab27ec4f8a6c58b35f3d449c9d2503a6fd23e06bbd4f6a \
--hash=sha256:7617164951c422747e7c32be4afa15d75ad8044f42e7d70d3e2e0429a50e6718 \
--hash=sha256:79970a6d759eb00a62266a31e2637d07d2d28446fca8079cf9afa7c07b0427f8 \
--hash=sha256:850a00b559e636b23901aabbe79b73dc604b4e4248ba9e2d6e72f95063765603 \
--hash=sha256:8be549886f5eda46436b6e555b0e4873b4f10aa21c07df45c4bc1735afbccd7a \
--hash=sha256:981287869e576d42c682cf7ca96af0c6ac544ed9316328fd0d9292795c742cf5 \
--hash=sha256:9877af9c6d1b15486e18a94101b742e9d0d2f343d35a634e337411ddb57783f3 \
--hash=sha256:998d38fcec96584deee1e79cd127469b3ad6fefd1ea6c2dfc54e8db367eb396b \
--hash=sha256:9d953531f5d9f00c90c34fa3b7d7cfb43ecff4c605dac9e4255a20b114a27369 \
--hash=sha256:ae80c08834a473d08a204d966982a62e11c976228d306a2648c575e3ead12111 \
--hash=sha256:c470f53cea065ff3d588050955c492793bb50c19a92923490d18fcb637f6383a \
--hash=sha256:c7e28d8fa47a0b30ae1bd7a079519dd852764e31708a7804da6cb6f8b36e3630 \
--hash=sha256:ded35e810438a527e17623ac6deae3b360134345b7c598175ab7741720d7ffa7 \
--hash=sha256:ee04835fb016e8062ee9fe9074aef9b82e430504e420bff51e3e5fffe72750ca \
--hash=sha256:fd6e2d7389542eae01077a1ee0318c4fec20c66c957f45c7aac0c6eb0fe3c612
# via -r requirements.in
scipy==1.11.1 \
--hash=sha256:08d957ca82d3535b3b9ba6c8ff355d78fe975271874e2af267cb5add5bd78625 \
--hash=sha256:249cfa465c379c9bb2c20123001e151ff5e29b351cbb7f9c91587260602c58d0 \
--hash=sha256:366a6a937110d80dca4f63b3f5b00cc89d36f678b2d124a01067b154e692bab1 \
--hash=sha256:39154437654260a52871dfde852adf1b93b1d1bc5dc0ffa70068f16ec0be2624 \
--hash=sha256:396fae3f8c12ad14c5f3eb40499fd06a6fef8393a6baa352a652ecd51e74e029 \
--hash=sha256:3b9963798df1d8a52db41a6fc0e6fa65b1c60e85d73da27ae8bb754de4792481 \
--hash=sha256:3e8eb42db36526b130dfbc417609498a6192381abc1975b91e3eb238e0b41c1a \
--hash=sha256:512fdc18c65f76dadaca139348e525646d440220d8d05f6d21965b8d4466bccd \
--hash=sha256:aec8c62fbe52914f9cf28d846cf0401dd80ab80788bbab909434eb336ed07c04 \
--hash=sha256:b41a0f322b4eb51b078cb3441e950ad661ede490c3aca66edef66f4b37ab1877 \
--hash=sha256:b4bb943010203465ac81efa392e4645265077b4d9e99b66cf3ed33ae12254173 \
--hash=sha256:b588311875c58d1acd4ef17c983b9f1ab5391755a47c3d70b6bd503a45bfaf71 \
--hash=sha256:ba94eeef3c9caa4cea7b402a35bb02a5714ee1ee77eb98aca1eed4543beb0f4c \
--hash=sha256:be8c962a821957fdde8c4044efdab7a140c13294997a407eaee777acf63cbf0c \
--hash=sha256:cce154372f0ebe88556ed06d7b196e9c2e0c13080ecb58d0f35062dc7cc28b47 \
--hash=sha256:d51565560565a0307ed06fa0ec4c6f21ff094947d4844d6068ed04400c72d0c3 \
--hash=sha256:e866514bc2d660608447b6ba95c8900d591f2865c07cca0aa4f7ff3c4ca70f30 \
--hash=sha256:fb5b492fa035334fd249f0973cc79ecad8b09c604b42a127a677b45a9a3d4289 \
--hash=sha256:ffb28e3fa31b9c376d0fb1f74c1f13911c8c154a760312fbee87a21eb21efe31
# via
# scikit-learn
# xgboost
six==1.16.0 \
--hash=sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926 \
--hash=sha256:8abb2f1d86890a2dfb989f9a77cfcfd3e47c2a354b01111771326f8aa26e0254
# via python-dateutil
threadpoolctl==3.2.0 \
--hash=sha256:2b7818516e423bdaebb97c723f86a7c6b0a83d3f3b0970328d66f4d9104dc032 \
--hash=sha256:c96a0ba3bdddeaca37dc4cc7344aafad41cdb8c313f74fdfe387a867bba93355
# via scikit-learn
tzdata==2023.3 \
--hash=sha256:11ef1e08e54acb0d4f95bdb1be05da659673de4acbd21bf9c69e94cc5e907a3a \
--hash=sha256:7e65763eef3120314099b6939b5546db7adce1e7d6f2e179e3df563c70511eda
# via pandas
xgboost==1.7.6 \
--hash=sha256:127cf1f5e2ec25cd41429394c6719b87af1456ce583e89f0bffd35d02ad18bcb \
--hash=sha256:1c527554a400445e0c38186039ba1a00425dcdb4e40b37eed0e74cb39a159c47 \
--hash=sha256:281c3c6f4fbed2d36bf95cd02a641afa95e72e9abde70064056da5e76233e8df \
--hash=sha256:4c34675b4d2678c624ddde5d45361e7e16046923e362e4e609b88353e6b87124 \
--hash=sha256:59b4b366d2cafc7f645e87d897983a5b59be02876194b1d213bd8d8b811d8ce8 \
--hash=sha256:b1d5db49b199152d62bd9217c98760207d3de86d2b9d243260c573ffe638f80a
# via -r requirements.in
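To illustrate what these hashes protect against, the sketch below checks a downloaded file against the hashes recorded for a requirement. It is a simplified stand-in for the check pip performs itself when a requirements file contains hashes, not pip's actual implementation. The helper names and the example package are made up for the demonstration.

```python
import hashlib
import re
import tempfile
from pathlib import Path

HASH_OPTION = re.compile(r"--hash=sha256:([0-9a-f]{64})")


def allowed_hashes(requirement_entry: str) -> set[str]:
    """Collect every sha256 digest listed for one requirement entry."""
    return set(HASH_OPTION.findall(requirement_entry))


def sha256_of(path: Path) -> str:
    """Compute the sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_allowed(path: Path, requirement_entry: str) -> bool:
    """True if the file's digest is one of the digests recorded for the entry."""
    return sha256_of(path) in allowed_hashes(requirement_entry)


def demo() -> tuple[bool, bool]:
    """Check a pretend wheel before and after its contents change."""
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "example-1.0-py3-none-any.whl"  # hypothetical file
        artifact.write_bytes(b"pretend wheel contents")
        entry = f"example==1.0 \\\n    --hash=sha256:{sha256_of(artifact)}"
        before = is_allowed(artifact, entry)
        artifact.write_bytes(b"tampered contents")
        after = is_allowed(artifact, entry)
        return before, after


print(demo())  # → (True, False)
```

A file that was added or altered after the hashes were generated no longer matches any recorded digest, so it would be rejected.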
pip-tools to update dependencies
Finally, pip-tools can also help developers manage package updates easily. They can run the following command to generate a requirements.txt file with the new versions of the packages:
$ pip-compile --upgrade requirements.in
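This command re-resolves the dependencies and rewrites requirements.txt with the newest versions that still satisfy the constraints in requirements.in. It only updates the pins; nothing is installed until you apply the file to an environment.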
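After an upgrade it helps to review what actually changed before committing the new file. The following sketch compares the pins of two requirements files. The helper names parse_pins and changed_pins are hypothetical, and the package names and versions in the sample are invented for illustration.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Extract 'name==version' pins, ignoring comments and option lines."""
    pins: dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-")):
            continue
        name, separator, rest = line.partition("==")
        if separator and rest.strip():
            # Drop a trailing "\" or ";" left over from hashes or markers.
            pins[name.strip().lower()] = rest.split()[0].rstrip("\\;")
    return pins


def changed_pins(old_text: str, new_text: str) -> dict[str, tuple]:
    """Map each changed package to (old_version, new_version); None means absent."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Invented example data, not real packages.
OLD = """\
example-lib==1.0.0
    # via -r requirements.in
stable-dep==2.4.1
    # via example-lib
"""
NEW = """\
example-lib==1.1.0
    # via -r requirements.in
new-helper==0.3.0
    # via example-lib
stable-dep==2.4.1
    # via example-lib
"""

for name, (before, after) in changed_pins(OLD, NEW).items():
    print(f"{name}: {before} -> {after}")
```

In practice the two inputs would be the requirements.txt contents before and after running the upgrade, for example read from version control.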
pip-sync
pip-sync is a tool provided by pip-tools that ensures your virtual environment contains only the packages listed in your requirements.txt file. This is useful because it prevents conflicts between versions of packages in your virtual environment and ensures you only have the packages necessary for your project installed, removing any previously installed packages that are no longer needed.
Using pip-sync is a good practice to ensure your project dependencies are isolated from other projects on your development machine. It prevents conflicts between different versions of packages, making your project more reliable and robust.
Once you have your requirements.txt file, run the command:
$ pip-sync
This command will install only the packages listed in requirements.txt and their dependencies, and remove any packages that are not listed. This ensures that you have an isolated environment with only the packages required by your project.
Example: replacing pandas with polars
If we decide to move from pandas to polars, we update our requirements.in file:
Listing 3: requirements.in after replacing pandas with polars
polars
scikit-learn
xgboost==1.7.6
and then run:
$ pip-compile
$ pip-sync
We will observe pip-sync doing its magic:
Found existing installation: pandas 2.0.3
Uninstalling pandas-2.0.3:
Successfully uninstalled pandas-2.0.3
Found existing installation: python-dateutil 2.8.2
Uninstalling python-dateutil-2.8.2:
Successfully uninstalled python-dateutil-2.8.2
Found existing installation: pytz 2023.3
Uninstalling pytz-2023.3:
Successfully uninstalled pytz-2023.3
Found existing installation: six 1.16.0
Uninstalling six-1.16.0:
Successfully uninstalled six-1.16.0
Found existing installation: tzdata 2023.3
Uninstalling tzdata-2023.3:
Successfully uninstalled tzdata-2023.3
Collecting polars==0.18.9 (from -r /tmp/tmpwq_lrber (line 1))
Obtaining dependency information for polars==0.18.9 from https://files.pythonhosted.org/packages/32/b7/bb1faf1741235f1147408322d1cd845e29d195e2e4dc636a3dea8b4ea119/polars-0.18.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached polars-0.18.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (14 kB)
Using cached polars-0.18.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB)
Installing collected packages: polars
Successfully installed polars-0.18.9
Notice how it removed all packages required only by pandas (see Listing 2) and then installed polars.
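Conceptually, the synchronization boils down to comparing what is installed with what is pinned. The sketch below reproduces the plan from the output above using the versions from Listing 2 and the polars version shown in the log. The function sync_plan is a hypothetical simplification; the real tool handles more details, such as leaving packaging tools like pip in place.

```python
def sync_plan(installed: dict[str, str], pinned: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (to_install, to_uninstall) so that the environment matches the pins."""
    to_install = sorted(
        f"{name}=={version}"
        for name, version in pinned.items()
        if installed.get(name) != version
    )
    to_uninstall = sorted(name for name in installed if name not in pinned)
    return to_install, to_uninstall


# Environment built from the original requirements (Listing 2).
INSTALLED = {
    "joblib": "1.3.1", "numpy": "1.25.1", "pandas": "2.0.3",
    "python-dateutil": "2.8.2", "pytz": "2023.3", "scikit-learn": "1.3.0",
    "scipy": "1.11.1", "six": "1.16.0", "threadpoolctl": "3.2.0",
    "tzdata": "2023.3", "xgboost": "1.7.6",
}

# Pins after replacing pandas with polars.
PINNED = {
    "joblib": "1.3.1", "numpy": "1.25.1", "polars": "0.18.9",
    "scikit-learn": "1.3.0", "scipy": "1.11.1", "threadpoolctl": "3.2.0",
    "xgboost": "1.7.6",
}

to_install, to_uninstall = sync_plan(INSTALLED, PINNED)
print(to_install)    # → ['polars==0.18.9']
print(to_uninstall)  # → ['pandas', 'python-dateutil', 'pytz', 'six', 'tzdata']
```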
Alternatives to pip-tools
While pip-tools is a popular and powerful tool for managing dependencies, developers can also use alternatives such as Poetry, Pipenv, conda, and setuptools. These tools all take slightly different approaches to managing package dependencies in Python projects. pip-tools has the advantage of being simpler and more lightweight, while Poetry, for instance, is preferable if you want a more complete feature set that covers not only package dependencies but also virtual environments, package building, and publishing.
Conclusion
pip-tools provides developers with an elegant and straightforward way to manage package dependencies in their Python projects. With its advantages such as generating pinned versions, managing multiple environments, and identifying available package updates, it simplifies the work of maintaining Python projects. By using pip-tools with different commands like pip-compile and pip-sync, developers can effectively solve the problem of managing their project dependencies.