Commit f6476e8

cancel some edits
1 parent 8244e98 commit f6476e8

7 files changed: 109 additions & 110 deletions

InternalDocs/changing_grammar.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -18,7 +18,7 @@ Below is a checklist of things that may need to change.
1818
to regenerate [`Parser/parser.c`](../Parser/parser.c).
1919
(This runs Python's parser generator, [`Tools/peg_generator`](../Tools/peg_generator)).
2020

21-
* [`Grammar/Tokens`](../Grammar/Tokens) is a place for adding new token types. After
21+
* [`Grammar/Tokens`](../Grammar/Tokens) is a place for adding new token types. After
2222
changing it, run ``make regen-token`` to regenerate
2323
[`Include/internal/pycore_token.h`](../Include/internal/pycore_token.h),
2424
[`Parser/token.c`](../Parser/token.c), [`Lib/token.py`](../Lib/token.py)

InternalDocs/compiler.md

Lines changed: 58 additions & 59 deletions
Large diffs are not rendered by default.

InternalDocs/exception_handling.md

Lines changed: 1 addition & 1 deletion
@@ -190,5 +190,5 @@ Exception Chaining Implementation
190190
refers to setting the `__context__` and `__cause__` fields of an exception as it is
191191
being raised. The `__context__` field is set by `_PyErr_SetObject()` in
192192
[Python/errors.c](../Python/errors.c) (which is ultimately called by all
193-
`PyErr_Set*()` functions). The `__cause__` field (explicit chaining) is set by
193+
`PyErr_Set*()` functions). The `__cause__` field (explicit chaining) is set by
194194
the `RAISE_VARARGS` bytecode.
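
The two kinds of chaining described in this hunk can be observed from Python code. The following is a minimal sketch; the function names `implicit`, `explicit` and `capture` are illustrative and do not appear in the document.

```python
def implicit():
    try:
        raise KeyError("inner")
    except KeyError:
        # Raising while another exception is being handled: the handled
        # exception is recorded as __context__ (implicit chaining).
        raise ValueError("outer")


def explicit():
    try:
        raise KeyError("inner")
    except KeyError as exc:
        # "raise ... from ..." records exc as __cause__ (explicit chaining).
        raise ValueError("outer") from exc


def capture(func):
    """Run func and return the ValueError it raises."""
    try:
        func()
    except ValueError as err:
        return err


err1 = capture(implicit)
err2 = capture(explicit)
print(type(err1.__context__).__name__, err1.__cause__)
print(type(err2.__cause__).__name__, err2.__suppress_context__)
```

In the explicit case `__context__` is still set as well; `__suppress_context__` only tells the traceback machinery to display `__cause__` instead.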

InternalDocs/frames.md

Lines changed: 2 additions & 2 deletions
@@ -125,10 +125,10 @@ the next instruction to be executed. During a call to a python function,
125125
to see in an exception traceback.
126126

127127
The `return_offset` field determines where a `RETURN` should go in the caller,
128-
relative to `instr_ptr`. It is only meaningful to the callee, so it needs to
128+
relative to `instr_ptr`. It is only meaningful to the callee, so it needs to
129129
be set in any instruction that implements a call (to a Python function),
130130
including CALL, SEND and BINARY_SUBSCR_GETITEM, among others. If there is no
131-
callee, then return_offset is meaningless. It is necessary to have a separate
131+
callee, then return_offset is meaningless. It is necessary to have a separate
132132
field for the return offset because (1) if we apply this offset to `instr_ptr`
133133
while executing the `RETURN`, this is too early and would lose us information
134134
about the previous instruction which we could need for introspecting and

InternalDocs/garbage_collector.md

Lines changed: 28 additions & 28 deletions
@@ -55,7 +55,7 @@ Starting in version 3.13, CPython contains two GC implementations:
5555
performing a collection for thread safety.
5656

5757
Both implementations use the same basic algorithms, but operate on different
58-
data structures. See the section on
58+
data structures. See the section on
5959
[Differences between GC implementations](#Differences-between-GC-implementations)
6060
for the details.
6161

@@ -64,7 +64,7 @@ Memory layout and object structure
6464
==================================
6565

6666
The garbage collector requires additional fields in Python objects to support
67-
garbage collection. These extra fields are different in the default and the
67+
garbage collection. These extra fields are different in the default and the
6868
free-threaded builds.
6969

7070

@@ -111,11 +111,11 @@ that in the [Optimization: incremental collection](#Optimization-incremental-col
111111
they are also reused to fulfill other purposes when the full doubly linked list
112112
structure is not needed as a memory optimization.
113113

114-
Doubly linked lists are used because they efficiently support the most frequently required operations. In
114+
Doubly linked lists are used because they efficiently support the most frequently required operations. In
115115
general, the collection of all objects tracked by GC is partitioned into disjoint sets, each in its own
116-
doubly linked list. Between collections, objects are partitioned into "generations", reflecting how
117-
often they've survived collection attempts. During collections, the generation(s) being collected
118-
are further partitioned into, for example, sets of reachable and unreachable objects. Doubly linked lists
116+
doubly linked list. Between collections, objects are partitioned into "generations", reflecting how
117+
often they've survived collection attempts. During collections, the generation(s) being collected
118+
are further partitioned into, for example, sets of reachable and unreachable objects. Doubly linked lists
119119
support moving an object from one partition to another, adding a new object, removing an object
120120
entirely (objects tracked by GC are most often reclaimed by the refcounting system when GC
121121
isn't running at all!), and merging partitions, all with a small constant number of pointer updates.
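
As a rough illustration of why doubly linked lists suit these operations, here is a small Python model of circular lists with a header node. It is only a sketch: the names `Node`, `unlink`, `append`, `move`, `merge` and `names` are made up for this example and are not the collector's C helpers.

```python
class Node:
    """List node; a header node (name=None) stands for the whole list."""

    def __init__(self, name=None):
        self.name = name
        self.prev = self
        self.next = self


def unlink(node):
    """Remove node from whatever list it is in: two pointer updates."""
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = node


def append(head, node):
    """Add node at the end of the list owned by head."""
    last = head.prev
    last.next = node
    node.prev = last
    node.next = head
    head.prev = node


def move(node, dest_head):
    """Move node from its current partition to another one."""
    unlink(node)
    append(dest_head, node)


def merge(src_head, dest_head):
    """Splice all of src onto the end of dest, leaving src empty."""
    if src_head.next is src_head:
        return
    first, last = src_head.next, src_head.prev
    tail = dest_head.prev
    tail.next = first
    first.prev = tail
    last.next = dest_head
    dest_head.prev = last
    src_head.prev = src_head.next = src_head


def names(head):
    """Walk the list (helper for inspection only)."""
    out, node = [], head.next
    while node is not head:
        out.append(node.name)
        node = node.next
    return out


young, old = Node(), Node()
a, b, c = Node("a"), Node("b"), Node("c")
for n in (a, b, c):
    append(young, n)
move(b, old)       # one object changes partition
merge(young, old)  # the rest of the partition follows in O(1)
print(names(old), names(young))
```

None of `unlink`, `append`, `move` or `merge` loops over the list, which is the constant-cost property the paragraph relies on.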
@@ -128,7 +128,7 @@ GC for the free-threaded build
128128
In the free-threaded build, Python objects contain a 1-byte field
129129
`ob_gc_bits` that is used to track garbage collection related state. The
130130
field exists in all objects, including ones that do not support cyclic
131-
garbage collection. The field is used to identify objects that are tracked
131+
garbage collection. The field is used to identify objects that are tracked
132132
by the collector, ensure that finalizers are called only once per object,
133133
and, during garbage collection, differentiate reachable vs. unreachable objects.
134134

@@ -192,7 +192,7 @@ the interpreter create cycles everywhere. Some notable examples:
192192
have internal links to themselves.
193193

194194
To correctly dispose of these objects once they become unreachable, they need
195-
to be identified first. To understand how the algorithm works, let’s take
195+
to be identified first. To understand how the algorithm works, let’s take
196196
the case of a circular linked list which has one link referenced by a
197197
variable `A`, and one self-referencing object which is completely
198198
unreachable:
@@ -220,15 +220,15 @@ unreachable:
220220
2
221221
```
222222

223-
The GC starts with a set of candidate objects it wants to scan. In the
223+
The GC starts with a set of candidate objects it wants to scan. In the
224224
default build, these "objects to scan" might be all container objects or a
225-
smaller subset (or "generation"). In the free-threaded build, the collector
225+
smaller subset (or "generation"). In the free-threaded build, the collector
226226
always scans all container objects.
227227

228-
The objective is to identify all the unreachable objects. The collector does
228+
The objective is to identify all the unreachable objects. The collector does
229229
this by identifying reachable objects; the remaining objects must be
230-
unreachable. The first step is to identify all of the "to scan" objects that
231-
are **directly** reachable from outside the set of candidate objects. These
230+
unreachable. The first step is to identify all of the "to scan" objects that
231+
are **directly** reachable from outside the set of candidate objects. These
232232
objects have a refcount larger than the number of incoming references from
233233
within the candidate set.
234234

@@ -241,7 +241,7 @@ interpreter will not modify the real reference count field.
241241
![gc-image1](images/python-cyclic-gc-1-new-page.png)
242242

243243
The GC then iterates over all containers in the first list and decrements by one the
244-
`gc_ref` field of any other object that container is referencing. Doing
244+
`gc_ref` field of any other object that container is referencing. Doing
245245
this makes use of the `tp_traverse` slot in the container class (implemented
246246
using the C API or inherited by a superclass) to know what objects are referenced by
247247
each container. After all the objects have been scanned, only the objects that have
@@ -273,7 +273,7 @@ When the GC encounters an object which is reachable (`gc_ref > 0`), it traverses
273273
its references using the `tp_traverse` slot to find all the objects that are
274274
reachable from it, moving them to the end of the list of reachable objects (where
275275
they started originally) and setting its `gc_ref` field to 1. This is what happens
276-
to `link_2` and `link_3` below as they are reachable from `link_1`. From the
276+
to `link_2` and `link_3` below as they are reachable from `link_1`. From the
277277
state in the previous image and after examining the objects referred to by `link_1`
278278
the GC knows that `link_3` is reachable after all, so it is moved back to the
279279
original list and its `gc_ref` field is set to 1 so that if the GC visits it again,
@@ -293,7 +293,7 @@ list are really unreachable and can thus be garbage collected.
293293

294294
Pragmatically, it's important to note that no recursion is required by any of this,
295295
and neither does it in any other way require additional memory proportional to the
296-
number of objects, number of pointers, or the lengths of pointer chains. Apart from
296+
number of objects, number of pointers, or the lengths of pointer chains. Apart from
297297
`O(1)` storage for internal C needs, the objects themselves contain all the storage
298298
the GC algorithms require.
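
The passes described in this file can be modelled in a few lines of Python. This is a simplified sketch, not the collector's code: `find_unreachable`, `refcounts` and `refs` are hypothetical names, the object name `link_4` for the self-referencing object is assumed, and the sketch uses a dict, a set and a worklist where the real collector works in place on its linked lists through `tp_traverse`.

```python
def find_unreachable(refcounts, refs):
    """Return the names of candidate objects that are unreachable.

    refcounts maps each candidate's name to its total reference count;
    refs maps each candidate's name to the names it references.
    """
    # Copy every refcount into a scratch value, as the gc_ref field does.
    gc_ref = dict(refcounts)

    # Subtract references that originate inside the candidate set.
    for targets in refs.values():
        for target in targets:
            if target in gc_ref:
                gc_ref[target] -= 1

    # Anything still above zero is referenced from outside the set;
    # everything reachable from such an object is reachable too.
    reachable = set()
    worklist = [name for name, count in gc_ref.items() if count > 0]
    while worklist:
        name = worklist.pop()
        if name in reachable:
            continue
        reachable.add(name)
        worklist.extend(t for t in refs.get(name, ()) if t in gc_ref)

    return set(refcounts) - reachable


# Circular list of three links, one of them also referenced by variable A,
# plus one object that only references itself.
refcounts = {"link_1": 2, "link_2": 1, "link_3": 1, "link_4": 1}
refs = {
    "link_1": ["link_2"],
    "link_2": ["link_3"],
    "link_3": ["link_1"],
    "link_4": ["link_4"],
}
unreachable = find_unreachable(refcounts, refs)
print(unreachable)
```

Unlike this model, the real algorithm needs no auxiliary containers, which is the point the paragraph above makes about `O(1)` extra storage.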
299299

@@ -317,8 +317,8 @@ list.
317317
So instead of not moving at all, the reachable objects B and A are each moved twice.
318318
Why is this a win? A straightforward algorithm to move the reachable objects instead
319319
would move A, B, and C once each. The key is that this dance leaves the objects in
320-
order C, B, A - it's reversed from the original order. On all *subsequent* scans,
321-
none of them will move. Since most objects aren't in cycles, this can save an
320+
order C, B, A - it's reversed from the original order. On all *subsequent* scans,
321+
none of them will move. Since most objects aren't in cycles, this can save an
322322
unbounded number of moves across an unbounded number of later collections. The only
323323
time the cost can be higher is the first time the chain is scanned.
324324

@@ -331,7 +331,7 @@ follows these steps in order:
331331

332332
1. Handle and clear weak references (if any). Weak references to unreachable objects
333333
are set to `None`. If the weak reference has an associated callback, the callback
334-
is enqueued to be called once the clearing of weak references is finished. We only
334+
is enqueued to be called once the clearing of weak references is finished. We only
335335
invoke callbacks for weak references that are themselves reachable. If both the weak
336336
reference and the pointed-to object are unreachable we do not execute the callback.
337337
This is partly for historical reasons: the callback could resurrect an unreachable
@@ -490,7 +490,7 @@ to the size of the data, often a word or multiple thereof. This discrepancy
490490
leaves a few of the least significant bits of the pointer unused, which can be
491491
used for tags or to keep other information – most often as a bit field (each
492492
bit a separate tag) – as long as code that uses the pointer masks out these
493-
bits before accessing memory. For example, on a 32-bit architecture (for both
493+
bits before accessing memory. For example, on a 32-bit architecture (for both
494494
addresses and word size), a word is 32 bits = 4 bytes, so word-aligned
495495
addresses are always a multiple of 4, hence end in `00`, leaving the last 2 bits
496496
available; while on a 64-bit architecture, a word is 64 bits = 8 bytes, so
@@ -519,10 +519,10 @@ of `PyGC_Head` discussed in the `Memory layout and object structure`_ section:
519519
- The `_gc_next` field is used as the "next" pointer to maintain the doubly linked
520520
list but during collection its lowest bit is used to keep the
521521
`NEXT_MASK_UNREACHABLE` flag that indicates if an object is tentatively
522-
unreachable during the cycle detection algorithm. This is a drawback to using only
522+
unreachable during the cycle detection algorithm. This is a drawback to using only
523523
doubly linked lists to implement partitions: while most needed operations are
524524
constant-time, there is no efficient way to determine which partition an object is
525-
currently in. Instead, when that's needed, ad hoc tricks (like the
525+
currently in. Instead, when that's needed, ad hoc tricks (like the
526526
`NEXT_MASK_UNREACHABLE` flag) are employed.
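
The bit manipulation behind such a flag can be sketched with plain integers standing in for addresses. Everything here is an assumption made for illustration: 8-byte alignment, the flag value, the sample address, and the helper names are not taken from the CPython sources.

```python
ALIGNMENT = 8             # assumed: 64-bit, word-aligned addresses
TAG_MASK = ALIGNMENT - 1  # the low 3 bits of an aligned address are zero
UNREACHABLE_FLAG = 0b001  # illustrative flag kept in the lowest bit


def set_flag(tagged, flag):
    return tagged | flag


def clear_flag(tagged, flag):
    return tagged & ~flag


def has_flag(tagged, flag):
    return bool(tagged & flag)


def address(tagged):
    """Mask out the tag bits before using the value as an address."""
    return tagged & ~TAG_MASK


ptr = 0x7F001230          # hypothetical aligned address
tagged = set_flag(ptr, UNREACHABLE_FLAG)
print(hex(tagged), has_flag(tagged, UNREACHABLE_FLAG), hex(address(tagged)))
```

The important discipline is the one in `address`: any code that follows the pointer has to strip the tag bits first.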
527527

528528
Optimization: delayed untracking containers
@@ -581,29 +581,29 @@ structure, while the free-threaded build implementation does not use that
581581
data structure.
582582

583583
- The default build implementation stores all tracked objects in a doubly
584-
linked list using `PyGC_Head`. The free-threaded build implementation
584+
linked list using `PyGC_Head`. The free-threaded build implementation
585585
instead relies on the embedded mimalloc memory allocator to scan the heap
586586
for tracked objects.
587587
- The default build implementation uses `PyGC_Head` for the unreachable
588-
object list. The free-threaded build implementation repurposes the
588+
object list. The free-threaded build implementation repurposes the
589589
`ob_tid` field to store a linked list of unreachable objects.
590590
- The default build implementation stores flags in the `_gc_prev` field of
591-
`PyGC_Head`. The free-threaded build implementation stores these flags
591+
`PyGC_Head`. The free-threaded build implementation stores these flags
592592
in `ob_gc_bits`.
593593

594594

595595
The default build implementation relies on the
596596
[global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)
597-
for thread safety. The free-threaded build implementation has two "stop the
597+
for thread safety. The free-threaded build implementation has two "stop the
598598
world" pauses, in which all other executing threads are temporarily paused so
599599
that the GC can safely access reference counts and object attributes.
600600

601-
The default build implementation is a generational collector. The
601+
The default build implementation is a generational collector. The
602602
free-threaded build is non-generational; each collection scans the entire
603603
heap.
604604

605605
- Keeping track of object generations is simple and inexpensive in the default
606-
build. The free-threaded build relies on mimalloc for finding tracked
606+
build. The free-threaded build relies on mimalloc for finding tracked
607607
objects; identifying "young" objects without scanning the entire heap would
608608
be more difficult.
609609

InternalDocs/interpreter.md

Lines changed: 16 additions & 16 deletions
@@ -31,7 +31,7 @@ It also has a reference to the `CodeObject` itself.
3131
In addition to the frame, `_PyEval_EvalFrame()` also receives a
3232
[`Thread State`](https://docs.python.org/3/c-api/init.html#c.PyThreadState)
3333
object, `tstate`, which includes things like the exception state and the
34-
recursion depth. The thread state also provides access to the per-interpreter
34+
recursion depth. The thread state also provides access to the per-interpreter
3535
state (`tstate->interp`), which has a pointer to the per-runtime (that is,
3636
truly global) state (`tstate->interp->runtime`).
3737

@@ -130,15 +130,15 @@ The size of the inline cache for a particular instruction is fixed by its `opcod
130130
Moreover, the inline cache size for all instructions in a
131131
[family of specialized/specializable instructions](adaptive.md)
132132
(for example, `LOAD_ATTR`, `LOAD_ATTR_SLOT`, `LOAD_ATTR_MODULE`) must all be
133-
the same. Cache entries are reserved by the compiler and initialized with zeros.
133+
the same. Cache entries are reserved by the compiler and initialized with zeros.
134134
Although they are represented by code units, cache entries do not conform to the
135135
`opcode` / `oparg` format.
136136

137137
If an instruction has an inline cache, the layout of its cache is described by
138138
a `struct` definition in [`pycore_code.h`](../Include/internal/pycore_code.h).
139139
This allows us to access the cache by casting `next_instr` to a pointer to this `struct`.
140140
The size of such a `struct` must be independent of the machine architecture, word size
141-
and alignment requirements. For a 32-bit field, the `struct` should use `_Py_CODEUNIT field[2]`.
141+
and alignment requirements. For a 32-bit field, the `struct` should use `_Py_CODEUNIT field[2]`.
142142

143143
The instruction implementation is responsible for advancing `next_instr` past the inline cache.
144144
For example, if an instruction's inline cache is four bytes (that is, two code units) in size,
@@ -210,12 +210,12 @@ In 3.10 and before, this was the case even when a Python function called
210210
another Python function:
211211
The `CALL` opcode would call the `tp_call` dispatch function of the
212212
callee, which would extract the code object, create a new frame for the call
213-
stack, and then call back into the interpreter. This approach is very general
213+
stack, and then call back into the interpreter. This approach is very general
214214
but consumes several C stack frames for each nested Python call, thereby
215215
increasing the risk of an (unrecoverable) C stack overflow.
216216

217217
Since 3.11, the `CALL` instruction special-cases function objects to "inline"
218-
the call. When a call gets inlined, a new frame gets pushed onto the call
218+
the call. When a call gets inlined, a new frame gets pushed onto the call
219219
stack and the interpreter "jumps" to the start of the callee's bytecode.
220220
When an inlined callee executes a `RETURN_VALUE` instruction, the frame is
221221
popped off the call stack and the interpreter returns to its caller,
@@ -248,12 +248,12 @@ In this case we allocate a proper `PyFrameObject` and initialize it from the
248248

249249
Things get more complicated when generators are involved, since those do not
250250
follow the push/pop model. This includes async functions, which are based on
251-
the same mechanism. A generator object has space for a `_PyInterpreterFrame`
251+
the same mechanism. A generator object has space for a `_PyInterpreterFrame`
252252
structure, including the variable-size part (used for locals and the eval stack).
253253
When a generator (or async) function is first called, a special opcode
254254
`RETURN_GENERATOR` is executed, which is responsible for creating the
255-
generator object. The generator object's `_PyInterpreterFrame` is initialized
256-
with a copy of the current stack frame. The current stack frame is then popped
255+
generator object. The generator object's `_PyInterpreterFrame` is initialized
256+
with a copy of the current stack frame. The current stack frame is then popped
257257
off the frame stack and the generator object is returned.
258258
(Details differ depending on the `is_entry` flag.)
259259
When the generator is resumed, the interpreter pushes its `_PyInterpreterFrame`
@@ -317,9 +317,9 @@ With a new bytecode you must also change what is called the "magic number" for
317317
.pyc files: bump the value of the variable `MAGIC_NUMBER` in
318318
[`Lib/importlib/_bootstrap_external.py`](../Lib/importlib/_bootstrap_external.py).
319319
Changing this number will cause all .pyc files with the old `MAGIC_NUMBER`
320-
to be recompiled by the interpreter on import. Whenever `MAGIC_NUMBER` is
320+
to be recompiled by the interpreter on import. Whenever `MAGIC_NUMBER` is
321321
changed, the ranges in the `magic_values` array in
322-
[`PC/launcher.c`](../PC/launcher.c) may also need to be updated. Changes to
322+
[`PC/launcher.c`](../PC/launcher.c) may also need to be updated. Changes to
323323
[`Lib/importlib/_bootstrap_external.py`](../Lib/importlib/_bootstrap_external.py)
324324
will take effect only after running `make regen-importlib`.
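
To see the magic number at work, one can compare the first four bytes of a .pyc file with `importlib.util.MAGIC_NUMBER`. The sketch below does that in a temporary directory; `has_current_magic` is a hypothetical helper, and the "stale" header is a made-up value, not a real historical magic number.

```python
import importlib.util
import os
import py_compile
import tempfile


def has_current_magic(pyc_path):
    """Return True if the .pyc header starts with this interpreter's magic."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")
    with open(src, "w") as f:
        f.write("x = 1\n")

    # A freshly compiled file carries the running interpreter's magic number.
    fresh = py_compile.compile(src, cfile=os.path.join(tmp, "demo.pyc"))

    # Simulate a file written by an interpreter with a different magic number.
    stale = os.path.join(tmp, "stale.pyc")
    with open(stale, "wb") as f:
        f.write(b"\x00\x00\r\n" + b"\x00" * 12)

    fresh_ok = has_current_magic(fresh)
    stale_ok = has_current_magic(stale)

print(fresh_ok, stale_ok)
```

A mismatch like the second one is what makes the import system discard the cached file and recompile the source.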
325325

@@ -333,12 +333,12 @@ will take effect only after running `make regen-importlib`.
333333
> On Windows, running the `./build.bat` script will automatically
334334
> regenerate the required files without requiring additional arguments.
335335
336-
Finally, you need to introduce the use of the new bytecode. Update
336+
Finally, you need to introduce the use of the new bytecode. Update
337337
[`Python/codegen.c`](../Python/codegen.c) to emit code with this bytecode.
338338
Optimizations in [`Python/flowgraph.c`](../Python/flowgraph.c) may also
339-
need to be updated. If the new opcode affects a control flow or the block
339+
need to be updated. If the new opcode affects a control flow or the block
340340
stack, you may have to update the `frame_setlineno()` function in
341-
[`Objects/frameobject.c`](../Objects/frameobject.c). It may also be necessary
341+
[`Objects/frameobject.c`](../Objects/frameobject.c). It may also be necessary
342342
to update [`Lib/dis.py`](../Lib/dis.py) if the new opcode interprets its
343343
argument in a special way (like `FORMAT_VALUE` or `MAKE_FUNCTION`).
344344

@@ -347,12 +347,12 @@ is already in existence and you do not change the magic number, make
347347
sure to delete your old .py(c|o) files! Even though you will end up changing
348348
the magic number if you change the bytecode, while you are debugging your work
349349
you may be changing the bytecode output without constantly bumping up the
350-
magic number. This can leave you with stale .pyc files that will not be
350+
magic number. This can leave you with stale .pyc files that will not be
351351
recreated.
352352
Running `find . -name '*.py[co]' -exec rm -f '{}' +` should delete all .pyc
353353
files you have, forcing new ones to be created and thus allowing you to test out your
354-
new bytecode properly. Run `make regen-importlib` for updating the
355-
bytecode of frozen importlib files. You have to run `make` again after this
354+
new bytecode properly. Run `make regen-importlib` for updating the
355+
bytecode of frozen importlib files. You have to run `make` again after this
356356
to recompile the generated C files.
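
Where `find` is not available, the same cleanup can be approximated in Python. This is a sketch under the assumption that every `*.pyc` and `*.pyo` file below a directory should go; `remove_stale_bytecode` is a hypothetical helper, and the demonstration runs against a temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path


def remove_stale_bytecode(root):
    """Delete every *.pyc / *.pyo file below root and return their paths."""
    removed = []
    for pattern in ("*.pyc", "*.pyo"):
        # Materialize the matches first so deletion does not race the walk.
        for path in list(Path(root).rglob(pattern)):
            path.unlink()
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp, "pkg", "__pycache__")
    cache.mkdir(parents=True)
    (cache / "mod.cpython-313.pyc").write_bytes(b"")
    Path(tmp, "pkg", "mod.py").write_text("x = 1\n")

    removed = remove_stale_bytecode(tmp)
    remaining = sorted(p.name for p in Path(tmp).rglob("*") if p.is_file())

print(len(removed), remaining)
```

Source files are left alone, so the next import recreates the bytecode from scratch.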
357357

358358
Additional resources
