From 4be119b7f88990bdc98d156118e67280e35e49f7 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 21 Mar 2022 18:29:38 -0500 Subject: [PATCH 1/8] PEP 630: Remove no-longer-nessesary Emacs config block --- pep-0630.rst | 9 --------- 1 file changed, 9 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index db1428fe23f..8e6d001adf5 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -595,12 +595,3 @@ Copyright This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. - -.. - Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - coding: utf-8 - End: From 778e9b10e2c978293441fe29347fa8660d38deee Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 21 Mar 2022 18:20:07 -0500 Subject: [PATCH 2/8] PEP 630: Remove spurious heading level and conform heading breaks/chars --- pep-0630.rst | 75 ++++++++++++++++++++++++++++++---------------------- 1 file changed, 44 insertions(+), 31 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index 8e6d001adf5..a98f594dfae 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -9,11 +9,8 @@ Created: 25-Aug-2020 Post-History: 16-Jul-2020 -Isolating Extension Modules -=========================== - Abstract --------- +======== Traditionally, state of Python extension modules was kept in C ``static`` variables, which have process-wide scope. This document @@ -25,8 +22,9 @@ possible. The switch involves allocating space for that state, potentially switching from static types to heap types, and—perhaps most importantly—accessing per-module state from code. + About this document -------------------- +=================== As an :pep:`informational PEP <1#pep-types>`, this document does not introduce any changes: those should be done in @@ -64,8 +62,9 @@ specific to CPython. As with any Informational PEP, this text does not necessarily represent a Python community consensus or recommendation. 
+ Motivation ----------- +========== An *interpreter* is the context in which Python code runs. It contains configuration (e.g. the import path) and runtime state (e.g. the set of @@ -96,8 +95,9 @@ Unfortunately, *per-interpreter* state is not easy to achieve: extension authors tend to not keep multiple interpreters in mind when developing, and it is currently cumbersome to test the behavior. + Rationale for Per-module State ------------------------------- +============================== Instead of focusing on per-interpreter state, Python's C API is evolving to better support the more granular *per-module* state. By default, @@ -112,8 +112,9 @@ module object is created, and clean up when it's freed. In this regard, a module is just like any other ``PyObject *``; there are no “on interpreter shutdown” hooks to think about—or forget about. + Goal: Easy-to-use Module State -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------------ It is currently cumbersome or impossible to do everything the C API offers while keeping modules isolated. Enabled by :pep:`384`, changes in @@ -128,16 +129,18 @@ per-thread or per-task state. The goal is to treat these as exceptional cases: they should be possible, but extension authors will need to think more carefully about them. + Non-goals: Speedups and the GIL -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------------- There is some effort to speed up CPython on multi-core CPUs by making the GIL per-interpreter. While isolating interpreters helps that effort, defaulting to per-module state will be beneficial even if no speedup is achieved, as it makes supporting multiple interpreters safer by default. + How to make modules safe with multiple interpreters ---------------------------------------------------- +=================================================== There are many ways to correctly support multiple interpreters in extension modules. 
The rest of this text describes the preferred way to @@ -153,7 +156,7 @@ This section assumes that “*you*” are an extension module author. Isolated Module Objects -~~~~~~~~~~~~~~~~~~~~~~~ +----------------------- The key point to keep in mind when developing an extension module is that several module objects can be created from a single shared library. @@ -179,8 +182,9 @@ While some modules could do with less stringent restrictions, isolated modules make it easier to set clear expectations (and guidelines) that work across a variety of use cases. + Surprising Edge Cases -~~~~~~~~~~~~~~~~~~~~~ +--------------------- Note that isolated modules do create some surprising edge cases. Most notably, each module object will typically not share its classes and @@ -194,7 +198,7 @@ separate objects. In the following code, the exception is *not* caught:: ... old_binascii.unhexlify(b'qwertyuiop') ... except binascii.Error: ... print('boo') - ... + ... Traceback (most recent call last): File "", line 2, in binascii.Error: Non-hexadecimal digit found @@ -206,8 +210,9 @@ The goal is to make extension modules safe at the C level, not to make hacks behave intuitively. Mutating ``sys.modules`` “manually” counts as a hack. + Managing Global State -~~~~~~~~~~~~~~~~~~~~~ +--------------------- Sometimes, state of a Python module is not specific to that module, but to the entire process (or something else “more global” than a module). @@ -229,8 +234,9 @@ avoid issues with multiple interpreters is to explicitly prevent a module from being loaded more than once per process—see “Opt-Out: Limiting to One Module Object per Process” below. + Managing Per-Module State -~~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------- To use per-module state, use `multi-phase extension module initialization `__ @@ -265,7 +271,7 @@ example module initialization shown at the bottom of the file. 
Opt-Out: Limiting to One Module Object per Process -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-------------------------------------------------- A non-negative ``PyModuleDef.m_size`` signals that a module supports multiple interpreters correctly. If this is not yet the case for your @@ -286,8 +292,9 @@ process. For example:: // ... rest of initialization } + Module State Access from Functions -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------------- Accessing the state from module-level functions is straightforward. Functions get the module object as their first argument; for extracting @@ -308,8 +315,9 @@ exception if there is no module state, i.e. ``PyModuleDef.m_size`` was zero. In your own module, you're in control of ``m_size``, so this is easy to prevent.) + Heap types ----------- +========== Traditionally, types defined in C code are *static*, that is, ``static PyTypeObject`` structures defined directly in code and @@ -347,7 +355,7 @@ or inherited slots). Always test the details that are important to you. Defining Heap Types -~~~~~~~~~~~~~~~~~~~ +------------------- Heap types can be created by filling a ``PyType_Spec`` structure, a description or “blueprint” of a class, and calling @@ -364,7 +372,7 @@ Python code). Garbage Collection Protocol -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +--------------------------- Instances of heap types hold a reference to their type. This ensures that the type isn't destroyed before its instance, @@ -403,7 +411,7 @@ and ``tp_clear``. 
Module State Access from Classes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-------------------------------- If you have a type object defined with ``PyType_FromModuleAndSpec()``, you can call ``PyType_GetModule`` to get the associated module, then @@ -417,8 +425,9 @@ these two steps with ``PyType_GetModuleState``, resulting in:: return NULL; } + Module State Access from Regular Methods -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------------------- Accessing the module-level state from methods of a class is somewhat more complicated, but possible thanks to changes introduced in :pep:`573`. @@ -445,7 +454,6 @@ that subclass, which may be defined in different module than yours. class Sub(Base): pass - For a method to get its “defining class”, it must use the ``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling convention `__ @@ -488,8 +496,9 @@ For example:: {NULL}, } + Module State Access from Slot Methods, Getters and Setters -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------------------------------------- .. note:: @@ -535,7 +544,7 @@ module. Lifetime of the Module State -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------- When a module object is garbage-collected, its module state is freed. For each pointer to (a part of) the module state, you must hold a reference @@ -550,30 +559,33 @@ libraries. Open Issues ------------ +=========== Several issues around per-module state and heap types are still open. Discussions about improving the situation are best held on the `capi-sig mailing list `__. + Type Checking -~~~~~~~~~~~~~ +------------- Currently (as of Python 3.10), heap types have no good API to write ``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a static type), and so it is not easy to ensure whether instances have a particular C layout. 
+ Metaclasses -~~~~~~~~~~~ +----------- Currently (as of Python 3.10), there is no good API to specify the *metaclass* of a heap type, that is, the ``ob_type`` field of the type object. + Per-Class scope -~~~~~~~~~~~~~~~ +--------------- It is also not possible to attach state to *types*. While ``PyHeapTypeObject`` is a variable-size object (``PyVarObject``), @@ -581,8 +593,9 @@ its variable-size storage is currently consumed by slots. Fixing this is complicated by the fact that several classes in an inheritance hierarchy may need to reserve some state. + Lossless conversion to heap types -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +--------------------------------- The heap type API was not designed for “lossless” conversion from static types, that is, creating a type that works exactly like a given static type. @@ -591,7 +604,7 @@ known “gotchas”. Copyright ---------- +========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. From aecdcfcd54b5b0bdbf3aa76f70c95b1baf70d36f Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Mon, 21 Mar 2022 18:22:42 -0500 Subject: [PATCH 3/8] PEP 630: Use standard ASCII quotes in source --- pep-0630.rst | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index a98f594dfae..33c52a67772 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -75,13 +75,13 @@ two cases to think about—users may run interpreters: - in sequence, with several ``Py_InitializeEx``/``Py_FinalizeEx`` cycles, and -- in parallel, managing “sub-interpreters” using +- in parallel, managing "sub-interpreters" using ``Py_NewInterpreter``/``Py_EndInterpreter``. Both cases (and combinations of them) would be most useful when embedding Python within a library. Libraries generally shouldn't make assumptions about the application that uses them, which includes -assumptions about a process-wide “main Python interpreter”. 
+assumptions about a process-wide "main Python interpreter". Currently, CPython doesn't handle this use case well. Many extension modules (and even some stdlib modules) use *per-process* global state, @@ -109,8 +109,8 @@ extension can even be loaded in a single interpreter. Per-module state provides an easy way to think about lifetime and resource ownership: the extension module will initialize when a module object is created, and clean up when it's freed. In this regard, -a module is just like any other ``PyObject *``; there are no “on -interpreter shutdown” hooks to think about—or forget about. +a module is just like any other ``PyObject *``; there are no "on +interpreter shutdown" hooks to think about—or forget about. Goal: Easy-to-use Module State @@ -152,7 +152,7 @@ module needs might not yet be ready. A full example module is available as `xxlimited `__. -This section assumes that “*you*” are an extension module author. +This section assumes that "*you*" are an extension module author. Isolated Module Objects @@ -174,7 +174,7 @@ As a rule of thumb, the two modules should be completely independent. All objects and state specific to the module should be encapsulated within the module object, not shared with other module objects, and cleaned up when the module object is deallocated. Exceptions are -possible (see “Managing global state” below), but they will need more +possible (see "Managing global state" below), but they will need more thought and attention to edge cases than code that follows this rule of thumb. @@ -207,7 +207,7 @@ This is expected. Notice that pure-Python modules behave the same way: it is a part of how Python works. The goal is to make extension modules safe at the C level, not to make -hacks behave intuitively. Mutating ``sys.modules`` “manually” counts +hacks behave intuitively. Mutating ``sys.modules`` "manually" counts as a hack. 
@@ -215,7 +215,7 @@ Managing Global State --------------------- Sometimes, state of a Python module is not specific to that module, but -to the entire process (or something else “more global” than a module). +to the entire process (or something else "more global" than a module). For example: - The ``readline`` module manages *the* terminal. @@ -231,8 +231,8 @@ If that is not possible, consider explicit locking. If it is necessary to use process-global state, the simplest way to avoid issues with multiple interpreters is to explicitly prevent a -module from being loaded more than once per process—see “Opt-Out: -Limiting to One Module Object per Process” below. +module from being loaded more than once per process—see "Opt-Out: +Limiting to One Module Object per Process" below. Managing Per-Module State @@ -337,16 +337,16 @@ the Python level: for example, you can't set ``str.myattribute = 123``. on CPython's current, process-wide GIL. Because they are immutable and process-global, static types cannot access -“their” module state. +"their" module state. If any method of such a type requires access to module state, the type must be converted to a *heap-allocated type*, or *heap type* -for short. These correspond more closely to classes created by Python’s +for short. These correspond more closely to classes created by Python's ``class`` statement. For new modules, using heap types by default is a good rule of thumb. Static types can be converted to heap types, but note that -the heap type API was not designed for “lossless” conversion +the heap type API was not designed for "lossless" conversion from static types -- that is, creating a type that works exactly like a given static type. Unlike static types, heap type objects are mutable by default. 
Also, when rewriting the class definition in a new API, @@ -358,7 +358,7 @@ Defining Heap Types ------------------- Heap types can be created by filling a ``PyType_Spec`` structure, a -description or “blueprint” of a class, and calling +description or "blueprint" of a class, and calling ``PyType_FromModuleAndSpec()`` to construct a new class object. .. note:: @@ -435,7 +435,7 @@ To get the state, you need to first get the *defining class*, and then get the module state from it. The largest roadblock is getting *the class a method was defined in*, or -that method's “defining class” for short. The defining class can have a +that method's "defining class" for short. The defining class can have a reference to the module it is part of. Do not confuse the defining class with ``Py_TYPE(self)``. If the method @@ -454,7 +454,7 @@ that subclass, which may be defined in different module than yours. class Sub(Base): pass -For a method to get its “defining class”, it must use the +For a method to get its "defining class", it must use the ``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling convention `__ and the corresponding `PyCMethod @@ -597,10 +597,10 @@ hierarchy may need to reserve some state. Lossless conversion to heap types --------------------------------- -The heap type API was not designed for “lossless” conversion from static types, +The heap type API was not designed for "lossless" conversion from static types, that is, creating a type that works exactly like a given static type. The best way to address it would probably be to write a guide that covers -known “gotchas”. +known "gotchas". Copyright From 9f5470137c86154e25be0e8af03b8977fe5e0c1f Mon Sep 17 00:00:00 2001 From: "C.A.M. 
Gerlach" Date: Mon, 21 Mar 2022 18:28:21 -0500 Subject: [PATCH 4/8] PEP 630: Use correct syntax highlighting declarations --- pep-0630.rst | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index 33c52a67772..6155581aa3d 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -9,6 +9,8 @@ Created: 25-Aug-2020 Post-History: 16-Jul-2020 +.. highlight:: c + Abstract ======== @@ -160,7 +162,9 @@ Isolated Module Objects The key point to keep in mind when developing an extension module is that several module objects can be created from a single shared library. -For example:: +For example: + +.. code-block:: pycon >>> import sys >>> import binascii @@ -190,7 +194,9 @@ Note that isolated modules do create some surprising edge cases. Most notably, each module object will typically not share its classes and exceptions with other similar modules. Continuing from the example above, note that ``old_binascii.Error`` and ``binascii.Error`` are -separate objects. In the following code, the exception is *not* caught:: +separate objects. In the following code, the exception is *not* caught: + +.. code-block:: pycon >>> old_binascii.Error == binascii.Error False @@ -420,10 +426,10 @@ you can call ``PyType_GetModule`` to get the associated module, then To save a some tedious error-handling boilerplate code, you can combine these two steps with ``PyType_GetModuleState``, resulting in:: - my_struct *state = (my_struct*)PyType_GetModuleState(type); - if (state === NULL) { - return NULL; - } + my_struct *state = (my_struct*)PyType_GetModuleState(type); + if (state === NULL) { + return NULL; + } Module State Access from Regular Methods @@ -445,7 +451,9 @@ that subclass, which may be defined in different module than yours. .. note:: The following Python code can illustrate the concept. ``Base.get_defining_class`` returns ``Base`` even - if ``type(self) == Sub``:: + if ``type(self) == Sub``: + + .. 
code-block:: python class Base: def get_defining_class(self): From b2af7745aa2dc6cfe45d74661d8f38aa26a5b0d3 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Tue, 22 Mar 2022 20:23:32 -0500 Subject: [PATCH 5/8] PEP 630: Use consistant case within and between section titles --- pep-0630.rst | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index 6155581aa3d..4001c4a88f5 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -25,7 +25,7 @@ switching from static types to heap types, and—perhaps most importantly—accessing per-module state from code. -About this document +About This Document =================== As an :pep:`informational PEP <1#pep-types>`, @@ -115,7 +115,7 @@ a module is just like any other ``PyObject *``; there are no "on interpreter shutdown" hooks to think about—or forget about. -Goal: Easy-to-use Module State +Goal: Easy-to-Use Module State ------------------------------ It is currently cumbersome or impossible to do everything the C API @@ -141,8 +141,8 @@ defaulting to per-module state will be beneficial even if no speedup is achieved, as it makes supporting multiple interpreters safer by default. -How to make modules safe with multiple interpreters -=================================================== +Making Modules Safe with Multiple Interpreters +============================================== There are many ways to correctly support multiple interpreters in extension modules. The rest of this text describes the preferred way to @@ -322,7 +322,7 @@ zero. In your own module, you're in control of ``m_size``, so this is easy to prevent.) -Heap types +Heap Types ========== Traditionally, types defined in C code are *static*, that is, @@ -592,7 +592,7 @@ Currently (as of Python 3.10), there is no good API to specify the object. -Per-Class scope +Per-Class Scope --------------- It is also not possible to attach state to *types*. 
While @@ -602,7 +602,7 @@ is complicated by the fact that several classes in an inheritance hierarchy may need to reserve some state. -Lossless conversion to heap types +Lossless Conversion to Heap Types --------------------------------- The heap type API was not designed for "lossless" conversion from static types, From b154e38fb1e742b27fd2f41e80a57ac7cc9e291b Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Tue, 22 Mar 2022 22:16:37 -0500 Subject: [PATCH 6/8] PEP 630: Add proper link in Discusses-To header --- pep-0630.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pep-0630.rst b/pep-0630.rst index 4001c4a88f5..db4abaace51 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -1,7 +1,7 @@ PEP: 630 Title: Isolating Extension Modules Author: Petr Viktorin -Discussions-To: capi-sig@python.org +Discussions-To: https://mail.python.org/archives/list/capi-sig@python.org/ Status: Active Type: Informational Content-Type: text/x-rst From 8bb59008860bc76c3638881ac98f727df34eec73 Mon Sep 17 00:00:00 2001 From: "C.A.M. Gerlach" Date: Tue, 22 Mar 2022 22:18:50 -0500 Subject: [PATCH 7/8] PEP 630: Copyedit the text of the PEP and add a few clarifying additions --- pep-0630.rst | 138 ++++++++++++++++++++++++++++----------------------- 1 file changed, 76 insertions(+), 62 deletions(-) diff --git a/pep-0630.rst b/pep-0630.rst index db4abaace51..d6ac838ac18 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -14,13 +14,13 @@ Post-History: 16-Jul-2020 Abstract ======== -Traditionally, state of Python extension modules was kept in C +Traditionally, state belonging to Python extension modules was kept in C ``static`` variables, which have process-wide scope. This document describes problems of such per-process state and efforts to make -per-module state, a better default, possible and easy to use. +per-module state—a better default—possible and easy to use. The document also describes how to switch to per-module state where -possible. 
The switch involves allocating space for that state, potentially +possible. This transition involves allocating space for that state, potentially switching from static types to heap types, and—perhaps most importantly—accessing per-module state from code. @@ -29,15 +29,16 @@ About This Document =================== As an :pep:`informational PEP <1#pep-types>`, -this document does not introduce any changes: those should be done in +this document does not introduce any changes; those should be done in their own PEPs (or issues, if small enough). Rather, it covers the motivation behind an effort that spans multiple releases, and instructs early adopters on how to use the finished features. -Once support is reasonably complete, the text can be moved to Python's -documentation as a HOWTO. Meanwhile, in the spirit of documentation-driven -development, gaps identified in this text can show where to focus -the effort, and the text can be updated as new features are implemented +Once support is reasonably complete, this content can be moved to Python's +documentation as a `HOWTO `__. +Meanwhile, in the spirit of documentation-driven development, +gaps identified in this PEP can show where to focus the effort, +and it can be updated as new features are implemented. Whenever this PEP mentions *extension modules*, the advice also applies to *built-in* modules. @@ -52,7 +53,7 @@ applies to *built-in* modules. PEPs related to this effort are: -- :pep:`384` -- *Defining a Stable ABI*, which added C API for creating +- :pep:`384` -- *Defining a Stable ABI*, which added a C API for creating heap types - :pep:`489` -- *Multi-phase extension module initialization* - :pep:`573` -- *Module State Access from C Extension Methods* @@ -83,7 +84,7 @@ two cases to think about—users may run interpreters: Both cases (and combinations of them) would be most useful when embedding Python within a library. 
Libraries generally shouldn't make assumptions about the application that uses them, which includes -assumptions about a process-wide "main Python interpreter". +assuming a process-wide "main Python interpreter". Currently, CPython doesn't handle this use case well. Many extension modules (and even some stdlib modules) use *per-process* global state, @@ -91,9 +92,9 @@ because C ``static`` variables are extremely easy to use. Thus, data that should be specific to an interpreter ends up being shared between interpreters. Unless the extension developer is careful, it is very easy to introduce edge cases that lead to crashes when a module is loaded in -more than one interpreter. +more than one interpreter in the same process. -Unfortunately, *per-interpreter* state is not easy to achieve: extension +Unfortunately, *per-interpreter* state is not easy to achieve—extension authors tend to not keep multiple interpreters in mind when developing, and it is currently cumbersome to test the behavior. @@ -104,7 +105,7 @@ Rationale for Per-module State Instead of focusing on per-interpreter state, Python's C API is evolving to better support the more granular *per-module* state. By default, C-level data will be attached to a *module object*. Each interpreter -will then create its own module object, keeping data separate. For +will then create its own module object, keeping the data separate. For testing the isolation, multiple module objects corresponding to a single extension can even be loaded in a single interpreter. @@ -112,7 +113,7 @@ Per-module state provides an easy way to think about lifetime and resource ownership: the extension module will initialize when a module object is created, and clean up when it's freed. In this regard, a module is just like any other ``PyObject *``; there are no "on -interpreter shutdown" hooks to think about—or forget about. +interpreter shutdown" hooks to think—or forget—about. 
Goal: Easy-to-Use Module State @@ -120,7 +121,7 @@ Goal: Easy-to-Use Module State It is currently cumbersome or impossible to do everything the C API offers while keeping modules isolated. Enabled by :pep:`384`, changes in -PEPs 489 and 573 (and future planned ones) aim to first make it +:pep:`489` and :pep:`573` (and future planned ones) aim to first make it *possible* to build modules this way, and then to make it *easy* to write new modules this way and to convert old ones, so that it can become a natural default. @@ -146,7 +147,7 @@ Making Modules Safe with Multiple Interpreters There are many ways to correctly support multiple interpreters in extension modules. The rest of this text describes the preferred way to -write such a module, or to convert an existing module. +write such a module, or to convert an existing one. Note that support is a work in progress; the API for some features your module needs might not yet be ready. @@ -178,7 +179,7 @@ As a rule of thumb, the two modules should be completely independent. All objects and state specific to the module should be encapsulated within the module object, not shared with other module objects, and cleaned up when the module object is deallocated. Exceptions are -possible (see "Managing global state" below), but they will need more +possible (see `Managing Global State`_), but they will need more thought and attention to edge cases than code that follows this rule of thumb. @@ -192,8 +193,9 @@ Surprising Edge Cases Note that isolated modules do create some surprising edge cases. Most notably, each module object will typically not share its classes and -exceptions with other similar modules. Continuing from the example -above, note that ``old_binascii.Error`` and ``binascii.Error`` are +exceptions with other similar modules. Continuing from the +`example above `__, +note that ``old_binascii.Error`` and ``binascii.Error`` are separate objects. In the following code, the exception is *not* caught: .. 
code-block:: pycon @@ -237,15 +239,15 @@ If that is not possible, consider explicit locking. If it is necessary to use process-global state, the simplest way to avoid issues with multiple interpreters is to explicitly prevent a -module from being loaded more than once per process—see "Opt-Out: -Limiting to One Module Object per Process" below. +module from being loaded more than once per process—see +`Opt-Out: Limiting to One Module Object per Process`_. Managing Per-Module State ------------------------- -To use per-module state, use `multi-phase extension module -initialization `__ +To use per-module state, use `multi-phase extension module initialization +`__ introduced in :pep:`489`. This signals that your module supports multiple interpreters correctly. @@ -254,8 +256,8 @@ bytes of storage local to the module. Usually, this will be set to the size of some module-specific ``struct``, which can store all of the module's C-level state. In particular, it is where you should put pointers to classes (including exceptions, but excluding static types) -and settings (e.g. ``csv``'s -`field_size_limit `__) +and settings (e.g. ``csv``'s `field_size_limit +`__) which the C code needs to function. .. note:: @@ -265,9 +267,9 @@ which the C code needs to function. which is easy to get wrong and hard to test sufficiently. If the module state includes ``PyObject`` pointers, the module object -must hold references to those objects and implement module-level hooks -``m_traverse``, ``m_clear``, ``m_free``. These work like -``tp_traverse``, ``tp_clear``, ``tp_free`` of a class. Adding them will +must hold references to those objects and implement the module-level hooks +``m_traverse``, ``m_clear`` and ``m_free``. These work like +``tp_traverse``, ``tp_clear`` and ``tp_free`` of a class. Adding them will require some work and make the code longer; this is the price for modules which can be unloaded cleanly. 
@@ -304,7 +306,7 @@ Module State Access from Functions Accessing the state from module-level functions is straightforward. Functions get the module object as their first argument; for extracting -the state there is ``PyModule_GetState``:: +the state, you can use ``PyModule_GetState``:: static PyObject * func(PyObject *module, PyObject *args) @@ -316,16 +318,17 @@ the state there is ``PyModule_GetState``:: // ... rest of logic } -(Note that ``PyModule_GetState`` may return NULL without setting an -exception if there is no module state, i.e. ``PyModuleDef.m_size`` was -zero. In your own module, you're in control of ``m_size``, so this is -easy to prevent.) +.. note:: + ``PyModule_GetState`` may return NULL without setting an + exception if there is no module state, i.e. ``PyModuleDef.m_size`` was + zero. In your own module, you're in control of ``m_size``, so this is + easy to prevent. Heap Types ========== -Traditionally, types defined in C code are *static*, that is, +Traditionally, types defined in C code are *static*; that is, ``static PyTypeObject`` structures defined directly in code and initialized using ``PyType_Ready()``. @@ -336,11 +339,15 @@ the Python level: for example, you can't set ``str.myattribute = 123``. .. note:: Sharing truly immutable objects between interpreters is fine, - as long as they don't provide access to mutable objects. But, every - Python object has a mutable implementation detail: the reference - count. Changes to the refcount are guarded by the GIL. Thus, code - that shares any Python objects across interpreters implicitly depends - on CPython's current, process-wide GIL. + as long as they don't provide access to mutable objects. + However, in CPython, every Python object has a mutable implementation + detail: the reference count. Changes to the refcount are guarded by the GIL. + Thus, code that shares any Python objects across interpreters implicitly + depends on CPython's current, process-wide GIL. 
+
+   :pep:`683` proposes a mechanism to mark certain objects as
+   immortal, such that they will be truly immutable, though is is likewise
+   considered an internal implementation detail.
 
 Because they are immutable and process-global, static types cannot access
 "their" module state.
@@ -381,7 +388,7 @@ Garbage Collection Protocol
 ---------------------------
 
 Instances of heap types hold a reference to their type.
-This ensures that the type isn't destroyed before its instance,
+This ensures that the type isn't destroyed before all its instances are,
 but may result in reference cycles that need to be broken by the
 garbage collector.
 
@@ -389,14 +396,18 @@ To avoid memory leaks, instances of heap types must implement the
 garbage collection protocol.
 That is, heap types should:
 
-- Have the ``Py_TPFLAGS_HAVE_GC`` flag,
+- Have the ``Py_TPFLAGS_HAVE_GC`` flag.
 - Define a traverse function using ``Py_tp_traverse``, which
   visits the type (e.g. using ``Py_VISIT(Py_TYPE(self));``).
 
-Please refer to the documentation of ``Py_TPFLAGS_HAVE_GC`` and
-``tp_traverse`` for additional considerations.
+Please refer to the `documentation
+`__ of `Py_TPFLAGS_HAVE_GC
+`__ and
+`tp_traverse
+`
+for additional considerations.
 
-If your traverse function delegates to ``tp_traverse`` of its base class
+If your traverse function delegates to the ``tp_traverse`` of its base class
 (or another type), ensure that ``Py_TYPE(self)`` is visited only once.
 Note that only heap type are expected to visit the type in ``tp_traverse``.
 
@@ -420,7 +431,7 @@ Module State Access from Classes
 --------------------------------
 
 If you have a type object defined with ``PyType_FromModuleAndSpec()``,
-you can call ``PyType_GetModule`` to get the associated module, then
+you can call ``PyType_GetModule`` to get the associated module, and then
 ``PyModule_GetState`` to get the module's state.
 
 To save a some tedious error-handling boilerplate code, you can combine
@@ -435,8 +446,8 @@ these two steps with ``PyType_GetModuleState``, resulting in::
 Module State Access from Regular Methods
 ----------------------------------------
 
-Accessing the module-level state from methods of a class is somewhat
-more complicated, but possible thanks to changes introduced in :pep:`573`.
+Accessing the module-level state from methods of a class is somewhat more
+complicated, but is possible thanks to the changes introduced in :pep:`573`.
 To get the state, you need to first get the *defining class*, and then
 get the module state from it.
 
@@ -463,10 +474,10 @@ that subclass, which may be defined in different module than yours.
         pass
 
 For a method to get its "defining class", it must use the
-``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling
-convention `__
-and the corresponding `PyCMethod
-signature `__::
+``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling convention
+`__
+and the corresponding `PyCMethod signature
+`__::
 
     PyObject *PyCMethod(
         PyObject *self, // object the method was called on
@@ -518,18 +529,20 @@ Module State Access from Slot Methods, Getters and Setters
       you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI
       compatibility with earlier versions.
 
-Slot methods -- the fast C equivalents for special methods, such as
-`nb_add `__
-for ``__add__`` or `tp_new `__
+Slot methods -- the fast C equivalents for special methods, such as `nb_add
+`__
+for ``__add__`` or `tp_new
+`__
 for initialization -- have a very simple API that doesn't allow
-passing in the defining class as in ``PyCMethod``.
+passing in the defining class, as is the case in ``PyCMethod``.
 The same goes for getters and setters defined with
 `PyGetSetDef `__.
 
-To access the module state in these cases, use the
-`PyType_GetModuleByDef `__
+To access the module state in these cases, use the `PyType_GetModuleByDef
+`__
 function, and pass in the module definition.
-Once you have the module, call `PyModule_GetState `__
+Once you have the module, call `PyModule_GetState
+`__
 to get the state::
 
     PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def);
@@ -538,7 +551,8 @@ to get the state::
         return NULL;
     }
 
-``PyType_GetModuleByDef`` works by searching the `MRO `__
+``PyType_GetModuleByDef`` works by searching the `MRO
+`__
 (i.e. all superclasses) for the first superclass that has a corresponding
 module.
 
@@ -580,7 +594,7 @@ Type Checking
 
 Currently (as of Python 3.10), heap types have no good API to write
 ``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a
-static type), and so it is not easy to ensure whether instances have a
+static type), and so it is not easy to ensure that instances have a
 particular C layout.
 
 
@@ -588,7 +602,7 @@ Metaclasses
 -----------
 
 Currently (as of Python 3.10), there is no good API to specify the
-*metaclass* of a heap type, that is, the ``ob_type`` field of the type
+*metaclass* of a heap type; that is, the ``ob_type`` field of the type
 object.
 
 
@@ -605,7 +619,7 @@ hierarchy may need to reserve some state.
 Lossless Conversion to Heap Types
 ---------------------------------
 
-The heap type API was not designed for "lossless" conversion from static types,
+The heap type API was not designed for "lossless" conversion from static types;
 that is, creating a type that works exactly like a given static type.
 The best way to address it would probably be to write a guide that covers
 known "gotchas".
From 637448e9facf671c67c1e13ca7126350e0798eb5 Mon Sep 17 00:00:00 2001
From: CAM Gerlach
Date: Fri, 25 Mar 2022 14:45:42 -0500
Subject: [PATCH 8/8] PEP 630: Further revise and update to reflect reviewer input

---
 pep-0630.rst | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/pep-0630.rst b/pep-0630.rst
index d6ac838ac18..267594775c2 100644
--- a/pep-0630.rst
+++ b/pep-0630.rst
@@ -1,7 +1,7 @@
 PEP: 630
 Title: Isolating Extension Modules
 Author: Petr Viktorin
-Discussions-To: https://mail.python.org/archives/list/capi-sig@python.org/
+Discussions-To: capi-sig@python.org
 Status: Active
 Type: Informational
 Content-Type: text/x-rst
@@ -345,10 +345,6 @@ the Python level: for example, you can't set ``str.myattribute = 123``.
    Thus, code that shares any Python objects across interpreters implicitly
    depends on CPython's current, process-wide GIL.
 
-   :pep:`683` proposes a mechanism to mark certain objects as
-   immortal, such that they will be truly immutable, though is is likewise
-   considered an internal implementation detail.
-
 Because they are immutable and process-global, static types cannot access
 "their" module state.
 If any method of such a type requires access to module state,
@@ -534,7 +530,7 @@ Slot methods -- the fast C equivalents for special methods, such as `nb_add
 for ``__add__`` or `tp_new
 `__
 for initialization -- have a very simple API that doesn't allow
-passing in the defining class, as is the case in ``PyCMethod``.
+passing in the defining class, unlike with ``PyCMethod``.
 The same goes for getters and setters defined with
 `PyGetSetDef `__.