Debugging Pip Extras: A Deep Dive into .whl Files
Background
Python packaging has a nifty feature that lets a package declare optional dependencies for extended functionality. As an example, Amazon SageMaker released a new capability to use partner AI apps within SageMaker. We worked with the partner app teams to include sagemaker as an optional dependency of their app SDK. Within AWS, you can enable the partner app and install the SDK with its SageMaker-specific functionality:
pip install partner_app_sdk[sagemaker]
The optional dependency is declared through the extras_require argument in setup.py, or in the [project.optional-dependencies] table in pyproject.toml.
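As a minimal sketch, the arguments a setup.py passes to declare such an extra might look like the following. The version number and the langchain extra are taken from the pip and grep output shown later in this post; the SETUP_KWARGS name and the overall layout are hypothetical and are not the partner's actual file.

```python
# Hypothetical setup.py fragment; the partner SDK's real file is not shown in this post.
SETUP_KWARGS = {
    "name": "partner_app_sdk",
    "version": "0.2.2",
    # Optional dependencies, keyed by extra name.
    # `pip install partner_app_sdk[sagemaker]` additionally installs the
    # requirements listed under the "sagemaker" key.
    "extras_require": {
        "langchain": ["langchain"],
        "sagemaker": ["sagemaker"],
    },
}

# A real setup.py would finish with:
#   from setuptools import setup
#   setup(**SETUP_KWARGS)
```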
Problem
An engineer on my team was testing a new version of a partner’s SDK. When they ran the command to upgrade the dependency, they encountered the following message:
> pip install partner_app_sdk[sagemaker] -U
Requirement already satisfied: partner_app_sdk[sagemaker] in /opt/conda/lib/python3.11/site-packages (0.1.5)
Collecting partner_app_sdk[sagemaker]
Using cached partner_app_sdk-0.2.2-py3-none-any.whl.metadata (514 bytes)
WARNING: partner_app_sdk 0.2.2 does not provide the extra 'sagemaker'
Despite the new version 0.2.2 being downloaded, pip warned that the sagemaker extra was not provided. This was puzzling, since the sagemaker extra was expected to be part of the package.
The Exploration
To debug what was happening, we started by reviewing the setup.py files for both versions. In each, the extras_require section correctly listed sagemaker as an optional dependency. So why was pip unable to find the sagemaker extra in the new version? We explored the following options:
Could the pip cache be corrupted? This was easy to rule out: we cleared the cache and re-ran the upgrade command, but the issue persisted.
Could it be a problem with the wheel (.whl) file?
Deep Dive into Wheel Files
A wheel (.whl) is a built distribution format for Python packages. It is a ZIP archive that contains the package and its metadata, including the information about optional dependencies.
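To make the link between the declaration and the wheel concrete: inside the archive, a METADATA file records each extra as a Provides-Extra field and each of its requirements as a Requires-Dist field carrying an extra marker. The helper below, render_extras_metadata (a hypothetical name), is a simplified illustration of that mapping. It is not the code that build tools actually run, and it assumes that no requirement already carries its own environment marker.

```python
def render_extras_metadata(extras_require):
    """Return the METADATA lines that would describe the given extras.

    Simplified illustration: assumes no requirement already has an
    environment marker of its own.
    """
    lines = []
    for extra, requirements in sorted(extras_require.items()):
        lines.append(f"Provides-Extra: {extra}")
        for requirement in requirements:
            lines.append(f'Requires-Dist: {requirement}; extra == "{extra}"')
    return lines


for line in render_extras_metadata(
    {"sagemaker": ["sagemaker"], "langchain": ["langchain"]}
):
    print(line)
```

If an extra never makes it into these fields when the wheel is built, pip has no way of knowing the extra exists, whatever the source files say.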
Inspecting the Wheel Files
To check the .whl files, we downloaded each version manually and examined the metadata inside the wheel.
> pip download partner_app_sdk==0.1.5 --no-deps
> unzip -p partner_app_sdk-0.1.5-py3-none-any.whl | grep -A 5 "Provides-Extra"
Provides-Extra: langchain
Requires-Dist: langchain; extra == "langchain"
Provides-Extra: sagemaker
Requires-Dist: sagemaker; extra == "sagemaker"
> pip download partner_app_sdk==0.2.2 --no-deps
> unzip -p partner_app_sdk-0.2.2-py3-none-any.whl | grep -A 5 "Provides-Extra"
Provides-Extra: langchain
Requires-Dist: langchain; extra == "langchain"
This revealed that sagemaker was indeed missing from the .whl metadata, despite being included in the setup.py file! This explains why pip wasn’t able to find the sagemaker extra and issued the warning.
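The same inspection can be scripted with the Python standard library, which avoids grepping through every file in the archive. This is a sketch rather than the tooling we used: wheel_extras is a hypothetical helper name, and it assumes the wheel contains exactly one .dist-info/METADATA entry.

```python
import zipfile
from email.parser import Parser


def wheel_extras(wheel_path):
    """Return the extras listed in the METADATA file of the wheel at wheel_path."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # The metadata lives at <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name
            for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        raw = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses an email-header style layout, so the stdlib parser can read it.
    message = Parser().parsestr(raw)
    return message.get_all("Provides-Extra") or []
```

Going by the grep output above, calling wheel_extras on the 0.1.5 wheel should list both langchain and sagemaker, while the 0.2.2 wheel should list only langchain.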
Root Cause?
We have yet to identify the root cause. Most likely, the build process produced incomplete or incorrect metadata in the wheel file. But now that we know this can happen, we will implement tests to validate the partner_app_sdk.