refactor(profiler): clear frozen map instead of using invalid iterator #13353
Merged
The cwisstable docs say conflicting things about what happens to an iterator after an erase_at call. The map_api.h example from the repo says the iterator is invalid. The internal CWISS_RawTable_erase_at function, which is the source of truth, says the iterator is invalid after the call. Fix memalloc_heap_map_destructive_copy to just clear the source map after copying over the entries, rather than erasing entries through the iterator.

Note that this is currently dead code. We take the heap profiler lock prior to freezing the heap profiler and emptying it out, so we can't add an entry to the frozen array: we'll fail to acquire the lock needed to do so. This means the map is empty and we won't see the invalidated iterator. However, this code will run once we rework the locking. I saw this code crash when pulling the heap map code into my work-in-progress branch, where it actually runs.

So we don't technically _need_ to fix this right now. However, I'd like the code on main to be right even if it's not running yet :)
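To illustrate the copy-then-clear pattern described above, here is a minimal sketch in C. It deliberately does not use the cwisstable API: `toy_map` and every `toy_map_*` function are hypothetical stand-ins (a tiny fixed-size open-addressing table) chosen so the example is self-contained. The point is only that the source map is never modified while it is being walked; it is emptied in one step afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the heap map. NOT the cwisstable API. */
#define TOY_CAP 16

typedef struct {
    bool used[TOY_CAP];
    size_t keys[TOY_CAP];
    int vals[TOY_CAP];
    size_t len;
} toy_map;

/* Drop every entry at once. */
static void toy_map_clear(toy_map *m) { memset(m, 0, sizeof *m); }

/* Insert or overwrite; returns false only when the table is full. */
static bool toy_map_insert(toy_map *m, size_t key, int val)
{
    size_t start = key % TOY_CAP;
    for (size_t n = 0; n < TOY_CAP; n++) {
        size_t i = (start + n) % TOY_CAP; /* linear probing */
        if (!m->used[i]) {
            m->used[i] = true;
            m->keys[i] = key;
            m->vals[i] = val;
            m->len++;
            return true;
        }
        if (m->keys[i] == key) {
            m->vals[i] = val;
            return true;
        }
    }
    return false;
}

/* Look up a key; writes the value to *out when found. */
static bool toy_map_get(const toy_map *m, size_t key, int *out)
{
    size_t start = key % TOY_CAP;
    for (size_t n = 0; n < TOY_CAP; n++) {
        size_t i = (start + n) % TOY_CAP;
        if (!m->used[i]) return false;
        if (m->keys[i] == key) {
            *out = m->vals[i];
            return true;
        }
    }
    return false;
}

/* Copy every entry from src into dst, then empty src in a single step.
 * src is only read during the walk, so no iterator is ever used after
 * an erase. */
static void toy_map_destructive_copy(toy_map *dst, toy_map *src)
{
    for (size_t i = 0; i < TOY_CAP; i++) {
        if (src->used[i]) {
            bool ok = toy_map_insert(dst, src->keys[i], src->vals[i]);
            assert(ok); /* sketch assumes dst has room */
            (void)ok;
        }
    }
    toy_map_clear(src);
}
```

With a real hash table the same shape applies: iterate read-only, insert into the destination, and call the table's clear operation once at the end.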
No changelog needed since this code hasn't been released yet. I'll port this fix to #13350 as well.
taegyunkim
approved these changes
May 7, 2025
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 235 ± 2 ms.
The average import time from base is: 237 ± 3 ms.
The import time difference between this PR and base is: -1.9 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:
Benchmarks

Benchmark execution time: 2025-05-07 19:49:15

Comparing candidate commit 90cc616 in PR branch

Found 0 performance improvements and 2 performance regressions! Performance is the same for 285 metrics, 5 unstable metrics.

scenario:iast_aspects-ospathbasename_aspect
scenario:iast_aspects-ospathsplitext_aspect
taegyunkim pushed a commit that referenced this pull request May 12, 2025
…backport 2.21] (#13317) (#13350)

Backports #13317 to 2.21, including the fix from #13353

The heap profiler component of memalloc uses a plain array to track allocations. It performs a linear scan of this array on every free to see if the freed allocation is tracked. This means the cost of freeing an allocation increases substantially for programs with large heaps.

Switch to a hash table for tracking allocations. This should have roughly constant overhead even for larger heaps, where we track more allocations.

This PR uses the C implementation of the [Abseil SwissTable](https://abseil.io/blog/20180927-swisstables) from https://github.com/google/cwisstable. This hash table is designed for memory efficiency, both in terms of memory used (1 byte per entry of metadata) and in terms of access (cache-friendly layout). The table is designed in particular for fast lookup and insertion. We do a lookup for every single free, so it's important that it's fast.

This PR previously tried to use a C++ `std::unordered_map`, but it proved far too difficult to integrate with the existing C `memalloc` implementation.

This PR includes the single `cwisstable.h` header with the following modifications, which can be found in the header by searching for `BEGIN MODIFICATION`:

- Fix `CWISS_Mul128`: for 64-bit Windows we need a different intrinsic, and for 32-bit systems we can't use it at all
- Fix a multiple declaration and invalid C++ syntax in `CWISS_LeadingZeroes64` for 32-bit Windows

The PR also supplies a 32-bit compatible hash function since the one in `cwisstable.h` couldn't be easily made to work. We use `xxHash` instead. Note that `cwisstable.h` requires the hash function to return a `size_t`, which is 32 bits on 32-bit platforms, but the SwissTable design expects 64-bit hashes. I don't know if this is a huge problem in practice or if people even use our profiler on 32-bit systems, but it's something to watch out for.
Note that the `cwisstable.h` hash table implementation basically never gets rid of backing memory unless we completely clear the table. This is not a big deal in practice. Each entry in the table is an 8-byte `void*` key and an 8-byte `traceback_t*`, plus 1 byte of control metadata. If we assume a target load factor of 50%, and we keep the existing `2^16` element cap, then a completely full table is only about 2MiB. Most of the heap profiler memory usage comes from the tracebacks themselves, which we _do_ free when they're removed from the profiler structures.

This PR also fixes a potential memory leak. Since we use try-locks, it is theoretically possible to fail to un-track an allocation if we can't acquire the lock. When this happens, the previous implementation would just leak an entry. With the hash table, if we find an existing entry for a given address, we can free the old traceback and replace it with the new one. The real fix will be to fix how we handle locking, but this PR improves the situation.

I've tested this locally and see very good performance even for larger heaps, and this has also been tested in #13307.
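As a back-of-the-envelope check on the "about 2MiB" estimate above, the arithmetic can be sketched as follows. This assumes the slot count is simply the element cap divided by the load factor and ignores any padding or growth policy inside `cwisstable.h`; `table_bytes` is a hypothetical helper, not part of the PR.

```c
#include <assert.h>
#include <stddef.h>

/* Rough size of a table holding max_entries at the given load factor.
 * Assumption: slots = max_entries / load factor, with no extra padding. */
static size_t table_bytes(size_t max_entries, size_t key_bytes,
                          size_t value_bytes, size_t ctrl_bytes,
                          unsigned load_percent)
{
    assert(load_percent > 0 && load_percent <= 100);
    size_t slots = max_entries * 100 / load_percent;
    return slots * (key_bytes + value_bytes + ctrl_bytes);
}
```

For the numbers in the description, `table_bytes(65536, 8, 8, 1, 50)` gives 131072 slots of 17 bytes each, i.e. 2,228,224 bytes, which is a little over 2 MiB.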
Fixes #13307

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
mabdinur pushed a commit that referenced this pull request May 19, 2025
#13353)