#12439 Change _observers from list to set #12440

allrob23 · 2025-03-24T23:59:14Z

Scope and purpose

This PR improves the performance of the LogPublisher class by replacing the list used for storing observers with a set. This change optimizes the lookup operation from O(N) to O(1) when checking if an observer is already registered. The scope of this PR is limited to this performance improvement and does not introduce any behavioral changes to the class.

Contributor Checklist:

This process applies to all pull requests - no matter how small.
Have a look at our developer documentation before submitting your Pull Request.

Below is a non-exhaustive list (as a reminder):

The title of the PR should describe the changes and starts with the associated issue number, like “#9782 Remove twisted.news. #1234 Brief description”.
A release notes news fragment file was create in src/twisted/newsfragments/ (see: Release notes fragments docs.)
The automated tests were updated.
Once all checks are green, request a review by leaving a comment that contains exactly the string please review.
Our bot will trigger the review process, by applying the pending review label
and requesting a review from the Twisted dev team.

codspeed-hq · 2025-03-25T00:02:12Z

CodSpeed Performance Report

Merging #12440 will degrade performances by 6.61%

_{Comparing allrob23:trunk (a4caa4b) with trunk (aedee3e)}

Summary

⚡ 2 improvements
❌ 1 regressions
✅ 33 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`test_log_publisher_add_remove`	137.2 ms	2.4 ms	×57
❌	`test_log_publisher_call_dispatch`	1.3 ms	1.4 ms	-6.61%
⚡	`test_http11_server_chunked_request`	1.8 ms	1.8 ms	+2.31%

glyph

Hi @allrob23 !

A couple of notes:

obviously the fact that this breaks the test suite needs to be addressed :)
please make a dedicated branch for this feature rather than using your fork's trunk branch for it
can you do this in 2 PRs, where the first adds a benchmark, and the second improves the benchmark? if this is a performance-oriented fix, given that we are lucky enough to have a performance regression suite on https://codspeed.io, we should use that suite to make sure it actually makes the difference that we think it does.

glyph · 2025-03-25T01:07:58Z

And, let's not forget the most important part: thank you for taking the time to contribute to Twisted.

allrob23 · 2025-03-25T15:35:14Z

Hi @glyph, do you have any insights on this test failure? Could it be a flaky test? It’s passing in my environment:

Test: test_trace
File: src/twisted/logger/test/test_observer.py

adiroiban · 2025-03-25T16:02:56Z

@allrob23 The twisted/twisted project is not using pytest to run the tests.

The tests are executed using twisted trial tool. Some docs is here

You can use trial directly or use tox

glyph · 2025-03-25T21:35:59Z

Running the test in isolation, I also see it pass, but only once: if I do trial -u twisted.logger.test.test_observer it quickly fails (on the second observer). see here:

glyph@gale:~/Projects/Twisted  🏰 Twisted  🔀 allrob23/trunk  🐚 🔑   🐳 🫛  ⮐
  ★ trial -u twisted.logger.test.test_observer
 ↩ Tue Mar 25 14:32:01 PDT 2025
Test Pass 1
twisted.logger.test.test_observer
  LogPublisherTests
    test_addObserver ...                                                   [OK]
    test_addObserverNotCallable ...                                        [OK]
    test_fanOut ...                                                        [OK]
    test_interface ...                                                     [OK]
    test_observerRaises ...                                                [OK]
    test_observerRaisesAndLoggerHatesMe ...                                [OK]
    test_observers ...                                                     [OK]
    test_removeObserver ...                                                [OK]
    test_removeObserverNotRegistered ...                                   [OK]
    test_trace ...                                                         [OK]

-------------------------------------------------------------------------------
Ran 10 tests in 0.002s

PASSED (successes=10)
Test Pass 2
twisted.logger.test.test_observer
  LogPublisherTests
    test_addObserver ...                                                   [OK]
    test_addObserverNotCallable ...                                        [OK]
    test_fanOut ...                                                        [OK]
    test_interface ...                                                     [OK]
    test_observerRaises ...                                                [OK]
    test_observerRaisesAndLoggerHatesMe ...                                [OK]
    test_observers ...                                                     [OK]
    test_removeObserver ...                                                [OK]
    test_removeObserverNotRegistered ...                                   [OK]
    test_trace ...                                                       [FAIL]

===============================================================================
[FAIL]
Traceback (most recent call last):
  File "/Users/glyph/Projects/Twisted/src/twisted/logger/test/test_observer.py", line 191, in test_trace
    self.assertEqual(traces[1], ((publisher, o1),))
  File "/Users/glyph/Projects/Twisted/src/twisted/trial/_synctest.py", line 444, in assertEqual
    super().assertEqual(first, second, msg)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/unittest/case.py", line 907, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/unittest/case.py", line 1124, in assertTupleEqual
    self.assertSequenceEqual(tuple1, tuple2, msg, seq_type=tuple)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/unittest/case.py", line 1095, in assertSequenceEqual
    self.fail(msg)
twisted.trial.unittest.FailTest: Tuples differ: ((<tw[104 chars]ls>.o2 at 0x104fe2980>), (<twisted.logger._obs[106 chars]60>)) != ((<tw[104 chars]ls>.o1 at 0x104fe2b60>),)

First differing element 0:
(<twi[63 chars]ction LogPublisherTests.test_trace.<locals>.o2 at 0x104fe2980>)
(<twi[63 chars]ction LogPublisherTests.test_trace.<locals>.o1 at 0x104fe2b60>)

First tuple contains 1 additional elements.
First extra element 1:
(<twisted.logger._observer.LogPublisher object at 0x104f7d9b0>, <function LogPublisherTests.test_trace.<locals>.o1 at 0x104fe2b60>)

  ((<twisted.logger._observer.LogPublisher object at 0x104f7d9b0>,
-   <function LogPublisherTests.test_trace.<locals>.o2 at 0x104fe2980>),
-  (<twisted.logger._observer.LogPublisher object at 0x104f7d9b0>,
-   <function LogPublisherTests.test_trace.<locals>.o1 at 0x104fe2b60>))
+   <function LogPublisherTests.test_trace.<locals>.o1 at 0x104fe2b60>),)
?                                                                      +


twisted.logger.test.test_observer.LogPublisherTests.test_trace
-------------------------------------------------------------------------------
Ran 10 tests in 0.002s

FAILED (failures=1, successes=9)

The behavior here is pretty predictable, which is that by moving observers into a set(), their previously deterministic order of execution is now non-deterministic. This seems like a potential problem for unit tests in other parts of Twisted as well; I don't think you can make this change exactly as-is without potentially making large numbers of other tests unreliable. Perhaps we need both a list and a set()? But then of course it's more work to add and remove observers.

All the more reason that we should really start performance work with a benchmark, since maintaining reliability here adds a potentially complex tradeoff.

allrob23 · 2025-03-25T21:46:09Z

I understand, thanks for the explanation, maybe adding a sorted() in the test assertion of the tests will solve the problem of reliability?

allrob23 · 2025-03-26T12:33:25Z

I understand the overall message. I'm short on time right now, but I'll take care of it as soon as I’m available.

glyph · 2025-03-26T21:45:13Z

I understand, thanks for the explanation, maybe adding a sorted() in the test assertion of the tests will solve the problem of reliability?

It's possible that code was relying upon the ordering of log observers previously, and it's not entirely clear to me that they would have been wrong to do so. If we want to make the ordering non-deterministic, we should probably have some tooling to allow developers to test with an intentionally randomized or at least intentionally-different order (perhaps reversed) before unilaterally changing the behavior.

for more information, see https://pre-commit.ci

allrob23 · 2025-04-30T00:31:33Z

I ran this benchmark separately in PR #12445, and on my machine the results were encouraging.

After the changes, the observer add/remove performance improved significantly, especially for larger sets.

Before:

After:

allrob23 · 2025-04-30T00:31:58Z

please review

adiroiban

Thanks for the PR.

I see that a performance test is failing, this needs to be fixed before merging.

I have never used the Twisted logger in production so I am not familiar with it's use cases.

In the documentation I see

Furthermore, no guarantees are made as to the order in which observers are called, so the effect of such modifications on other observers may be non-deterministic.

src/twisted/logger/_observer.py

adiroiban · 2025-05-02T08:47:57Z

src/twisted/logger/test/test_observer.py


-        self.assertEqual(traces[1], ((publisher, o1),))
-        self.assertEqual(traces[2], ((publisher, o1), (publisher, o2)))
+        self.assertIn((publisher, o1), traces[1])


Just asking. Can we use the stdlib assertCountEqual here ?

To be honest, I don't know if it would work to use assertCountEqual here

allrob23 · 2025-05-02T14:49:35Z

About the performance check:

The change in test_log_publisher_call_dispatch (from 1.3 ms to 1.4 ms, -6.71%) justifies the significant performance gains in other benchmarks. Notably, test_log_publisher_add_remove improved dramatically from 137.2 ms to 2.4 ms (57x faster), and test_lineReceiver[64] gained 3.25% (8.5 ms to 8.2 ms).

BTW, on my machine, the test_log_publisher_call_dispatch slowdown does not occur.

Further highlight the overall positive impact. @adiroiban what do you think?

adiroiban · 2025-05-03T14:45:28Z

Sorry for the delay.
I don't know what to say here.

This PR has some improvemtns, but at the same time it contains a regression.

I don't have much experience with writing/handling performance tests.

Maybe there is a defect in the way test_log_publisher_call_dispatch is written.

I think that log publishing is much more important than add/remove of observers... and it matters that we have a performance issue here.

My suggestion is try to get more eyes for this PR by asking for feedback over the mailing list https://mail.python.org/mailman3/lists/twisted.python.org/

glyph · 2025-05-03T21:24:58Z

I think that log publishing is much more important than add/remove of observers... and it matters that we have a performance issue here.

I am also inclined to think that this is the more significant code path in most applications, but obviously I'd prefer it if both were fast.

@allrob23 — I guess I'm curious for a more in-depth explanation of the motivating use-case for #12439 — is there a type of application where you want to add and remove log observers in a tight loop, and/or is there a place where Twisted is doing this itself leading to some other, higher-level slowdown?

Thanks!

allrob23 · 2025-05-04T15:36:46Z

Thank you for the responses! I quickly noticed that a benchmark test was incorrect, and the DummyObserver was taking up some time. I opened another PR (#12446) to fix this. I'm a bit busy with other tasks right now, but I can dive into the details of how we spotted these issues later. Please review PR #12446, and we can revisit this discussion once it's merged.

glyph

Looks like we are blocked on #12465 to get this to the point where we could even reasonably ack any perf regression, so I am taking this out of the queue for now.

glyph requested changes Mar 25, 2025

View reviewed changes

allrob23 added 3 commits April 28, 2025 19:41

perf: change _observers from list to set

8512de1

feat: add misc changelog

7160759

feat: just trying set of lookup and list to store

68f87b5

allrob23 force-pushed the trunk branch from 8dd7b72 to 68f87b5 Compare April 28, 2025 23:41

allrob23 and others added 3 commits April 28, 2025 19:52

small

4c8927a

just try

b1ff8d1

[pre-commit.ci] auto fixes from pre-commit.com hooks

d113742

for more information, see https://pre-commit.ci

allrob23 mentioned this pull request Apr 30, 2025

#12440 Add benchmark for LogPublisher #12445

Merged

chevah-robot added the needs-review label Apr 30, 2025

chevah-robot requested a review from a team April 30, 2025 00:32

adiroiban added a commit that referenced this pull request May 2, 2025

#12440 Add benchmark for LogPublisher (#12445)

f0027bf

Merge branch 'trunk' into trunk

81a1a56

adiroiban requested changes May 2, 2025

View reviewed changes

chevah-robot added needs-changes and removed needs-review labels May 2, 2025

refactor: remove unused ValueError

e053f88

allrob23 mentioned this pull request May 4, 2025

#12439 change parts that may be affecting the benchmark #12446

Closed

chevah-robot added needs-review and removed needs-changes labels May 4, 2025

chevah-robot requested a review from a team May 4, 2025 15:36

Merge branch 'trunk' into trunk

a4caa4b

glyph requested changes Aug 11, 2025

View reviewed changes

chevah-robot added needs-changes and removed needs-review labels Aug 11, 2025

glyph mentioned this pull request Aug 12, 2025

Optimize LogPublisher.addObserver by using a set instead of a list #12439

Open

#12439 Change _observers from list to set #12440

Are you sure you want to change the base?

#12439 Change _observers from list to set #12440

Conversation

allrob23 commented Mar 24, 2025

Scope and purpose

Contributor Checklist:

Uh oh!

codspeed-hq bot commented Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #12440 will degrade performances by 6.61%

Summary

Benchmarks breakdown

Uh oh!

glyph left a comment

Choose a reason for hiding this comment

Uh oh!

glyph commented Mar 25, 2025

Uh oh!

allrob23 commented Mar 25, 2025

Uh oh!

adiroiban commented Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

glyph commented Mar 25, 2025

Uh oh!

allrob23 commented Mar 25, 2025

Uh oh!

allrob23 commented Mar 26, 2025

Uh oh!

glyph commented Mar 26, 2025

Uh oh!

allrob23 commented Apr 30, 2025

Uh oh!

allrob23 commented Apr 30, 2025

Uh oh!

adiroiban left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

adiroiban May 2, 2025

Choose a reason for hiding this comment

Uh oh!

allrob23 May 2, 2025

Choose a reason for hiding this comment

Uh oh!

allrob23 commented May 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adiroiban commented May 3, 2025

Uh oh!

glyph commented May 3, 2025

Uh oh!

allrob23 commented May 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

glyph left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codspeed-hq bot commented Mar 25, 2025 •

edited

Loading

adiroiban commented Mar 25, 2025 •

edited

Loading

allrob23 commented May 2, 2025 •

edited

Loading

allrob23 commented May 4, 2025 •

edited

Loading