
@henryiii
Contributor

@henryiii henryiii commented Nov 27, 2025

This makes the __str__ method about 10% faster. It is used quite a bit, so saving some time here is useful.

@henryiii henryiii force-pushed the henryiii/perf/str branch 3 times, most recently from 267b5d9 to e584ec3 Compare November 27, 2025 16:14
@henryiii
Contributor Author

henryiii commented Nov 27, 2025

Almost all of the savings here came from avoiding the NamedTuple indirection. This is now only 1% faster in total time, and probably about 5% faster for the str operation itself. Saving the intermediate value no longer has any measurable effect, so I've removed that.

Now it's mostly up to if you think this looks better (and it still is a little faster).
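To make the "NamedTuple indirection" point concrete, here is a minimal sketch of the two access patterns. The class and attribute names (`_Fields`, `NestedVersion`, `FlatVersion`) are hypothetical and are not taken from the library's source; the sketch only illustrates why reading a field through a property that in turn reads a NamedTuple field costs more than reading a slot directly.

```python
import timeit
from typing import NamedTuple, Tuple


class _Fields(NamedTuple):
    # Hypothetical container for parsed fields (an assumption for illustration).
    epoch: int
    release: Tuple[int, ...]


class NestedVersion:
    """Keeps fields in a NamedTuple and exposes them through properties."""

    def __init__(self, epoch, release):
        self._fields = _Fields(epoch, release)

    @property
    def epoch(self):
        return self._fields.epoch

    @property
    def release(self):
        return self._fields.release

    def __str__(self):
        # Every access here is a property call plus a NamedTuple field lookup.
        prefix = f"{self.epoch}!" if self.epoch != 0 else ""
        return prefix + ".".join(map(str, self.release))


class FlatVersion:
    """Keeps the same fields directly on the instance."""

    __slots__ = ("_epoch", "_release")

    def __init__(self, epoch, release):
        self._epoch = epoch
        self._release = release

    def __str__(self):
        # Direct slot access, with no intermediate object.
        prefix = f"{self._epoch}!" if self._epoch != 0 else ""
        return prefix + ".".join(map(str, self._release))


if __name__ == "__main__":
    for obj in (NestedVersion(1, (2, 3, 4)), FlatVersion(1, (2, 3, 4))):
        seconds = timeit.timeit(lambda o=obj: str(o), number=200_000)
        print(type(obj).__name__, str(obj), f"{seconds:.3f}s")
```

Both classes produce the same string; only the timing differs, and the size of the gap depends on the interpreter version and machine.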

@brettcannon
Member

> Now it's mostly up to if you think this looks better

I'm indifferent.

@notatallshaw
Member

"".join(...) reads better to me because I've written that pattern so often in Python, but that's just anecdotal.

I was also under the impression, apparently incorrectly, that join would be faster. Because naively concatenating strings can be O(n^2) with regards to memory allocation operations, and I thought the .join method had some kind of optimization to handle that. Maybe this is just too few concatenations with too small strings where the memory allocations become a dominating factor.
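A quick way to check the "too few, too small strings" hypothesis is to time both styles on a handful of short parts. The sketch below is an assumption-laden stand-in, not the code from this PR: the function names and the three-part layout (epoch, release, suffix) are invented for illustration.

```python
import timeit


def with_join(epoch, release, suffix):
    # Collect the parts in a list, then join once at the end.
    parts = []
    if epoch:
        parts.append(f"{epoch}!")
    parts.append(release)
    if suffix is not None:
        parts.append(suffix)
    return "".join(parts)


def with_concat(epoch, release, suffix):
    # Build the string directly; no intermediate list is allocated.
    result = f"{epoch}!{release}" if epoch else release
    if suffix is not None:
        result += suffix
    return result


if __name__ == "__main__":
    args = (1, "2.3.4", ".post1")
    for func in (with_join, with_concat):
        seconds = timeit.timeit(lambda f=func: f(*args), number=500_000)
        print(func.__name__, f"{seconds:.3f}s")
```

With only two or three short pieces, the list allocation and the method call in the join version are a noticeable share of the work, which is one plausible reason the asymptotic advantage of join does not show up here.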

@henryiii
Contributor Author

I'm nearly sure it's because the strings are generally small. If they were large, I'm almost sure it would be the other way around.

I played around with several ways to do this - I thought making all four separately then using an f-string to join them would be fastest, but short circuiting if None was too important. Now that the largest cost (accessing the nested field in the NamedTuple) is gone, it's possible that is faster.
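The two shapes being compared can be sketched as follows. This is one reading of the comment, with hypothetical function names and a simplified field layout (release, pre, post, dev, local); it is not the code from this PR.

```python
def str_short_circuit(release, pre=None, post=None, dev=None, local=None):
    # Only touch a segment when it is present; absent segments cost one
    # "is not None" check and nothing else.
    result = ".".join(map(str, release))
    if pre is not None:
        result += f"{pre[0]}{pre[1]}"
    if post is not None:
        result += f".post{post}"
    if dev is not None:
        result += f".dev{dev}"
    if local is not None:
        result += f"+{local}"
    return result


def str_all_parts(release, pre=None, post=None, dev=None, local=None):
    # Always build every segment (possibly empty), then combine them in a
    # single f-string at the end.
    release_s = ".".join(map(str, release))
    pre_s = "" if pre is None else f"{pre[0]}{pre[1]}"
    post_s = "" if post is None else f".post{post}"
    dev_s = "" if dev is None else f".dev{dev}"
    local_s = "" if local is None else f"+{local}"
    return f"{release_s}{pre_s}{post_s}{dev_s}{local_s}"
```

Most versions in practice have only a release segment, so the short-circuit form does almost no work in the common case, while the all-parts form always pays for the final formatting step.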

@brettcannon
Member

> I was also under the impression, apparently incorrectly, that join would be faster. Because naively concatenating strings can be O(n^2) with regards to memory allocation operations, and I thought the .join method had some kind of optimization to handle that. Maybe this is just too few concatenations with too small strings where the memory allocations become a dominating factor.

There's an optimization in CPython specifically for += in a loop. So you're right that str.join() should be faster, but we cheated in CPython. 😁
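A rough way to observe this is to double the number of appended pieces and see how the time for a `+=` loop grows. The helper names below are made up for the sketch. The in-place behavior is a CPython implementation detail that applies when nothing else references the string being extended, so other interpreters, or code that keeps a second reference to the string, may show quadratic growth instead.

```python
import timeit


def build_with_iadd(n):
    # Repeated += on a local string; CPython can often extend it in place.
    s = ""
    for _ in range(n):
        s += "x"
    return s


def build_with_join(n):
    # The portable idiom: gather the pieces, then join once.
    return "".join("x" for _ in range(n))


if __name__ == "__main__":
    for n in (10_000, 20_000, 40_000):
        t_iadd = timeit.timeit(lambda: build_with_iadd(n), number=20)
        t_join = timeit.timeit(lambda: build_with_join(n), number=20)
        print(f"n={n:>6}  +=: {t_iadd:.4f}s  join: {t_join:.4f}s")
```

If the `+=` timings roughly double as `n` doubles, the loop is behaving linearly on that interpreter.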

@henryiii henryiii force-pushed the henryiii/perf/str branch 4 times, most recently from 624a237 to 4a4953d Compare January 5, 2026 16:50
@henryiii
Contributor Author

henryiii commented Jan 5, 2026

Okay, after one more change (inlining base_version), this is now much faster, around 10%.

All benchmarks:

| Change | Before [3803ce3] | After [4a4953d] <henryiii/perf/str> | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| | 2.30±0.01ms | 2.31±0.01ms | 1.01 | markers.TimeMarkerSuite.time_constructor |
| | 1.20±0.02ms | 1.15±0.05ms | 0.95 | markers.TimeMarkerSuite.time_evaluate |
| | 9.88±0.4ms | 9.68±0.7ms | 0.98 | requirement.TimeRequirementSuite.time_constructor |
| | 601±8μs | 605±70μs | 1.01 | resolver.TimeResolverSuite.time_resolver_loop |
| | 3.43±0.04ms | 3.33±0.03ms | 0.97 | specifiers.TimeSpecSuite.time_constructor |
| | 4.09±0.02ms | 4.00±0.03ms | 0.98 | specifiers.TimeSpecSuite.time_contains |
| | 61.5±0.3μs | 61.2±0.3μs | 1 | specifiers.TimeSpecSuite.time_filter |
| | 3.99±0.04μs | 4.04±0.04μs | 1.01 | utils.TimeUtils.time_canonicalize_name |
| | 1.97±0.01ms | 1.98±0ms | 1.01 | version.TimeVersionSuite.time_constructor |
| | 1.87±0.01ms | 1.87±0.01ms | 1 | version.TimeVersionSuite.time_sort |
| - | 811±6μs | 728±9μs | 0.9 | version.TimeVersionSuite.time_str |
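The "inlining base_version" change mentioned above can be pictured with a small sketch. The classes and fields here are hypothetical and much simpler than the real ones; the point is only that calling a property from inside __str__ adds a descriptor lookup and a function call, which inlining the same expression avoids.

```python
class ViaProperty:
    """__str__ delegates to a base_version property (hypothetical layout)."""

    def __init__(self, epoch, release, post=None):
        self._epoch = epoch
        self._release = release
        self._post = post

    @property
    def base_version(self):
        release = ".".join(map(str, self._release))
        return f"{self._epoch}!{release}" if self._epoch else release

    def __str__(self):
        result = self.base_version  # extra property call on every str()
        if self._post is not None:
            result += f".post{self._post}"
        return result


class Inlined(ViaProperty):
    """Same output, but the base_version logic is repeated inside __str__."""

    def __str__(self):
        result = ".".join(map(str, self._release))
        if self._epoch:
            result = f"{self._epoch}!{result}"
        if self._post is not None:
            result += f".post{self._post}"
        return result
```

The cost of this design is duplicated logic: the property and __str__ have to be kept in sync by hand, so it is only worth doing on a path that is measurably hot.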

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>

chore: remove saving intermediate (no longer much faster)

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>
@henryiii henryiii merged commit 65092ce into pypa:main Jan 6, 2026
40 checks passed
@henryiii henryiii deleted the henryiii/perf/str branch January 6, 2026 00:00
3 participants