From 2a71e6edef4a0b5f026c3f6962b474e754b88f6f Mon Sep 17 00:00:00 2001
From: Fridolin Pokorny
Date: Wed, 25 Jan 2023 20:25:14 +0100
Subject: [PATCH 1/2] PEP 705: Recording provenance of installed packages

Signed-off-by: Fridolin Pokorny
---
 pep-0705.rst | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)
 create mode 100644 pep-0705.rst

diff --git a/pep-0705.rst b/pep-0705.rst
new file mode 100644
index 00000000000..7ebe2112f7d
--- /dev/null
+++ b/pep-0705.rst
@@ -0,0 +1,96 @@
+PEP: 705
+Title: Recording provenance of installed packages
+Author: Fridolin Pokorny , Trishank Karthik Kuppusamy
+Sponsor:
+PEP-Delegate:
+Discussions-To: https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340
+Status: Draft
+Type: Process
+Content-Type: text/x-rst
+Created: 30-Jan-2002
+Post-History:
+
+
+Abstract
+========
+
+This PEP describes a way to record provenance of Python packages installed.
+The record is created by an installer and is available to users in a form of a
+JSON file ``direct_url.json`` in ``.dist-info`` directory. The PEP is an
+extension to :pep:`610` for cases when installed packages are comming from a
+package index.
+
+
+Motivation
+==========
+
+Installing a Python package involves downloading the package from a source and
+extracting its content to an appropriate place. After the installation process
+is done, information about the artifact used as well as its source is generally
+lost. Nevertheless, there are use cases for keeping records of artifacts used
+for installing packages and their provenance.
+
+Python wheels can be built with different compiler flags or supporting
+different wheel tags. In both cases, users might get into a situation in which
+multiple wheels might be considered by installers (possibly from different
+package indexes) and immediately finding out which artifact was actually used
+during the installation might be helpful. This way, tools reporting software
+installed, such as tools reporting SBOM (Software Bill of Material), might give
+more accurate reports.
+
+The motivation described in this PEP is a direct extension to :pep:`610`.
+Besides stating information about packages installed from an URL, installers
+SHOULD record information also for packages installed from Python package
+indexes or from filesystem.
+
+
+Examples
+========
+
+An example of a ``direct_url.json`` file in the ``.dist-info`` directory
+alongside files stated in :pep:`376` and further adjusted in :pep:`627`. The
+specified file stores a JSON describing artifact used to install a package as
+well as its source. The record is similar to ``direct_url.json`` described in
+:pep:`610`:
+
+.. code::
+
+    {
+        "archive_info": {
+            "hash": "sha256=714ac14496c3e68c99c29b00845f7a2b85f3bb6f1078fd9f72fd20f0570002b2"
+        },
+        "url": "https://files.pythonhosted.org/packages/ed/35/a31aed2993e398f6b09a790a181a7927eb14610ee8bbf02dc14d31677f1c/packaging-23.0-py3-none-any.whl"
+    }
+
+If a source distribution was used to build a wheel file which was subsequently
+installed, the ``url`` must state URL to the source distribution used.
+
+For cases when a package is installed from a local directory,
+``direct_url.json`` SHOULD preserve path to the file used:
+
+.. code::
+
+    {
+        "archive_info": {
+            "hash": "sha256=b9c46cc36662a7949f34b52d8ec7bb59c0d74ba08ba6cb9ce9adc1d8676d9526"
+        },
+        "url": "file:///home/user/wheels/Flask-2.2.2-py3-none-any.whl"
+    }
+
+For installations when a package is installed by providing an URL, :pep:`610` is
+still applicable.
+
+In both cases, the JSON document is stating the following entries:
+
+* ``archive_info.hash`` MUST be present with a value of ``=``,
+  currently supported hash algorithm is only ``sha256``.
+
+* ``url`` - MUST be present and points to source from where the package was obtained.
+  The value MUST be stripped of any sensitive authentication information, for security
+  reasons.
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.

From fa325938f3ecc519f2dbea4f688e6f6cd0dd623e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fridol=C3=ADn=20Pokorn=C3=BD?=
Date: Tue, 31 Jan 2023 09:49:53 +0100
Subject: [PATCH 2/2] Apply suggestions from code review

Co-authored-by: C.A.M. Gerlach
Co-authored-by: Hugo van Kemenade
---
 pep-0705.rst | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/pep-0705.rst b/pep-0705.rst
index 7ebe2112f7d..a6de5706a4b 100644
--- a/pep-0705.rst
+++ b/pep-0705.rst
@@ -1,6 +1,7 @@
 PEP: 705
-Title: Recording provenance of installed packages
-Author: Fridolin Pokorny , Trishank Karthik Kuppusamy
+Title: Recording the provenance of installed packages
+Author: Fridolin Pokorny ,
+        Trishank Karthik Kuppusamy ,
 Sponsor:
 PEP-Delegate:
 Discussions-To: https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340
@@ -8,16 +9,17 @@ Status: Draft
 Type: Process
 Content-Type: text/x-rst
 Created: 30-Jan-2002
-Post-History:
+Post-History: `03-Dec-2021 `__,
+              `30-Jan-2023 `__,
 
 
 Abstract
 ========
 
-This PEP describes a way to record provenance of Python packages installed.
-The record is created by an installer and is available to users in a form of a
-JSON file ``direct_url.json`` in ``.dist-info`` directory. The PEP is an
-extension to :pep:`610` for cases when installed packages are comming from a
+This PEP describes a way to record the provenance of Python packages installed.
+The record is created by an installer and is available to users in the form of a
+JSON file ``direct_url.json`` in the ``.dist-info`` directory. The PEP is an
+extension to :pep:`610` for cases when installed packages come from a
 package index.
 
 
@@ -35,13 +37,13 @@ different wheel tags. In both cases, users might get into a situation in which
 multiple wheels might be considered by installers (possibly from different
 package indexes) and immediately finding out which artifact was actually used
 during the installation might be helpful. This way, tools reporting software
-installed, such as tools reporting SBOM (Software Bill of Material), might give
+installed, such as tools reporting a software bill of Materials (SBOM), might give
 more accurate reports.
 
 The motivation described in this PEP is a direct extension to :pep:`610`.
-Besides stating information about packages installed from an URL, installers
+Besides stating information about packages installed from a URL, installers
 SHOULD record information also for packages installed from Python package
-indexes or from filesystem.
+indexes or from the filesystem.
 
 
 Examples
@@ -53,7 +55,7 @@ specified file stores a JSON describing artifact used to install a package as
 well as its source. The record is similar to ``direct_url.json`` described in
 :pep:`610`:
 
-.. code::
+.. code-block:: json
 
     {
         "archive_info": {
@@ -63,12 +65,12 @@ well as its source. The record is similar to ``direct_url.json`` described in
     }
 
 If a source distribution was used to build a wheel file which was subsequently
-installed, the ``url`` must state URL to the source distribution used.
+installed, the ``url`` MUST state URL to the source distribution used.
 
 For cases when a package is installed from a local directory,
 ``direct_url.json`` SHOULD preserve path to the file used:
 
-.. code::
+.. code-block:: json
 
     {
         "archive_info": {
@@ -77,7 +79,7 @@ For cases when a package is installed from a local directory,
         "url": "file:///home/user/wheels/Flask-2.2.2-py3-none-any.whl"
     }
 
-For installations when a package is installed by providing an URL, :pep:`610` is
+For installations when a package is installed by providing a URL, :pep:`610` is
 still applicable.
 
 In both cases, the JSON document is stating the following entries:
@@ -85,7 +87,7 @@ In both cases, the JSON document is stating the following entries:
 * ``archive_info.hash`` MUST be present with a value of ``=``,
   currently supported hash algorithm is only ``sha256``.
 
-* ``url`` - MUST be present and points to source from where the package was obtained.
+* ``url`` - MUST be present and points to the source from where the package was obtained.
   The value MUST be stripped of any sensitive authentication information, for security
   reasons.
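
Review note: the two entries the draft requires (``archive_info.hash`` and a credential-free ``url``) could be produced by an installer roughly as in the sketch below. This is only an illustration under stated assumptions, not code from pip or any other installer. The helper names ``strip_credentials`` and ``build_provenance_record`` are hypothetical, the ``sha256=`` prefix follows the JSON examples in the patch, and dropping the whole ``user:password@`` part is a deliberately strict reading of "sensitive authentication information".

```python
import hashlib
import json
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return the URL without any ``user:password@`` part (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        # Nothing to strip, e.g. file:// URLs or anonymous index URLs.
        return url
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def build_provenance_record(url: str, artifact: bytes) -> dict:
    """Build a dict shaped like the direct_url.json examples in the patch."""
    digest = hashlib.sha256(artifact).hexdigest()
    return {
        "archive_info": {"hash": f"sha256={digest}"},
        "url": strip_credentials(url),
    }


if __name__ == "__main__":
    # Assumed inputs: a made-up index URL with credentials and placeholder bytes.
    record = build_provenance_record(
        "https://user:secret@example.org/simple/pkg-1.0-py3-none-any.whl",
        b"placeholder wheel contents",
    )
    print(json.dumps(record, indent=2))
```

An installer would write the resulting JSON to ``direct_url.json`` in the package's ``.dist-info`` directory. Note that ``urlsplit`` lowercases the host name when credentials are stripped, which should be harmless for host names but is worth knowing when comparing URLs byte for byte.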