diff --git a/pep-0705.rst b/pep-0705.rst
new file mode 100644
index 00000000000..a6de5706a4b
--- /dev/null
+++ b/pep-0705.rst
@@ -0,0 +1,98 @@
+PEP: 705
+Title: Recording the provenance of installed packages
+Author: Fridolin Pokorny ,
+        Trishank Karthik Kuppusamy ,
+Sponsor:
+PEP-Delegate:
+Discussions-To: https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340
+Status: Draft
+Type: Process
+Content-Type: text/x-rst
+Created: 30-Jan-2002
+Post-History: `03-Dec-2021 `__,
+              `30-Jan-2023 `__,
+
+
+Abstract
+========
+
+This PEP describes a way to record the provenance of installed Python packages.
+The record is created by an installer and is available to users in the form of a
+JSON file ``direct_url.json`` in the ``.dist-info`` directory. The PEP is an
+extension to :pep:`610` for cases when installed packages come from a
+package index.
+
+
+Motivation
+==========
+
+Installing a Python package involves downloading the package from a source and
+extracting its content to an appropriate place. After the installation process
+is done, information about the artifact used as well as its source is generally
+lost. Nevertheless, there are use cases for keeping records of artifacts used
+for installing packages and their provenance.
+
+Python wheels can be built with different compiler flags or can support
+different wheel tags. In both cases, installers might consider multiple
+wheels (possibly from different package indexes), so being able to find out
+immediately which artifact was actually used during the installation is
+helpful. With this record, tools that report installed software, such as
+tools producing a Software Bill of Materials (SBOM), can give more accurate
+reports.
+
+The motivation described in this PEP is a direct extension to :pep:`610`.
+Besides recording information about packages installed from a URL, installers
+SHOULD also record information for packages installed from Python package
+indexes or from the filesystem.
+
+
+Examples
+========
+
+The following is an example of a ``direct_url.json`` file, which lives in the
+``.dist-info`` directory alongside the files specified in :pep:`376` and
+further adjusted in :pep:`627`. The file stores a JSON document describing the
+artifact used to install a package as well as its source. The record is
+similar to the ``direct_url.json`` described in :pep:`610`:
+
+.. code-block:: json
+
+    {
+        "archive_info": {
+            "hash": "sha256=714ac14496c3e68c99c29b00845f7a2b85f3bb6f1078fd9f72fd20f0570002b2"
+        },
+        "url": "https://files.pythonhosted.org/packages/ed/35/a31aed2993e398f6b09a790a181a7927eb14610ee8bbf02dc14d31677f1c/packaging-23.0-py3-none-any.whl"
+    }
+
+If a source distribution was used to build a wheel file which was subsequently
+installed, ``url`` MUST state the URL of the source distribution used.
+
+When a package is installed from a local directory,
+``direct_url.json`` SHOULD preserve the path to the file used:
+
+.. code-block:: json
+
+    {
+        "archive_info": {
+            "hash": "sha256=b9c46cc36662a7949f34b52d8ec7bb59c0d74ba08ba6cb9ce9adc1d8676d9526"
+        },
+        "url": "file:///home/user/wheels/Flask-2.2.2-py3-none-any.whl"
+    }
+
+When a package is installed by providing a URL, :pep:`610` is
+still applicable.
+
+In both cases, the JSON document contains the following entries:
+
+* ``archive_info.hash`` MUST be present with a value of the form
+  ``<hash-algorithm>=<hash>``; currently, the only supported algorithm is ``sha256``.
+
+* ``url`` MUST be present and point to the source from which the package was
+  obtained. The value MUST be stripped of any sensitive authentication
+  information, for security reasons.
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.
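
As a rough illustration of the record described in the Examples section of the patch, the following Python sketch builds a ``direct_url.json``-style document from an artifact's bytes and the URL it was fetched from. The helper names ``strip_credentials`` and ``provenance_record`` are hypothetical: they are not defined by this PEP or by any installer. Dropping the ``user:password@`` part of the URL is only one plausible reading of the requirement to strip sensitive authentication information.

```python
import hashlib
import json
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return the URL without any ``user:password@`` part (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals must be wrapped in brackets again.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def provenance_record(url: str, artifact: bytes) -> dict:
    """Build a dict with the two entries the draft requires (hypothetical helper)."""
    # sha256 is the only algorithm the draft currently lists as supported.
    digest = hashlib.sha256(artifact).hexdigest()
    return {
        "archive_info": {"hash": f"sha256={digest}"},
        "url": strip_credentials(url),
    }


if __name__ == "__main__":
    # Example input only: placeholder bytes and a made-up index URL.
    record = provenance_record(
        "https://user:secret@example.com/packages/demo-1.0-py3-none-any.whl",
        b"placeholder wheel bytes",
    )
    print(json.dumps(record, indent=4))
```

An installer would hash the file it actually unpacked and write the resulting JSON into the package's ``.dist-info`` directory. The sketch stops short of that step because the draft does not describe how the file is written.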