Add jemalloc heap profiling support #3923
Conversation
Force-pushed from 4d0fd64 to 82bdce5
Pull request overview
This PR adds optional jemalloc heap profiling capabilities to production services for debugging memory issues without requiring restarts or separate profiling builds. The feature is controlled by a jemalloc-profiling feature flag and exposes Unix sockets for on-demand heap dump collection.
Key changes:
- New `heap_dump_handler` module that spawns a Unix socket server for heap dump requests
- Feature flag integration across 6 services with conditional jemalloc allocator
- Docker and CI infrastructure updates to support profiling builds
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| crates/observe/src/heap_dump_handler.rs | New module implementing Unix socket server for heap dump collection via jemalloc_pprof |
| crates/observe/src/lib.rs | Conditionally exports heap_dump_handler module under jemalloc-profiling feature |
| crates/observe/Cargo.toml | Adds jemalloc_pprof dependency and jemalloc-profiling feature flag |
| crates/*/src/main.rs | Conditionally switches global allocator from mimalloc to jemalloc based on feature flag |
| crates/*/src/run.rs | Spawns heap dump handler during service initialization |
| crates/*/Cargo.toml | Adds jemalloc-profiling feature flag with required dependencies |
| Cargo.toml | Adds workspace-level jemalloc dependencies and updates tempfile version |
| Cargo.lock | Lock file updates for new dependencies |
| Dockerfile | Adds build arguments for feature flags and installs netcat-openbsd in final image |
| .github/workflows/deploy-profiling.yaml | New CI workflow for building and deploying profiling-enabled images |
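For illustration, a minimal sketch (not the PR's actual code) of what such a dump-on-request Unix socket server could look like, assuming `jemalloc_pprof::PROF_CTL` and a tokio runtime; the function name, buffer handling, and error handling here are illustrative:

```rust
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::UnixListener;

// Hedged sketch of the flow described in the table above; the real module in
// crates/observe/src/heap_dump_handler.rs may differ in details.
async fn serve_heap_dumps(socket_path: &str) -> std::io::Result<()> {
    let listener = UnixListener::bind(socket_path)?;
    loop {
        let (mut stream, _) = listener.accept().await?;
        // The client sends a short "dump" command (see the nc one-liner later in the PR).
        let mut buf = [0u8; 16];
        let n = stream.read(&mut buf).await?;
        if !buf[..n].starts_with(b"dump") {
            continue;
        }
        // PROF_CTL is None when jemalloc profiling was not enabled at startup.
        if let Some(prof_ctl) = jemalloc_pprof::PROF_CTL.as_ref() {
            let dump = {
                let mut guard = prof_ctl.lock().await;
                guard.dump_pprof()
            };
            if let Ok(pprof_data) = dump {
                // Stream the pprof-encoded profile back to the client.
                stream.write_all(&pprof_data).await?;
            }
        }
    }
}
```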
Copilot reviewed 24 out of 25 changed files in this pull request and generated 7 comments.
MartinquaXD left a comment:
Looks like this should be very helpful!
Did you already confirm that the generated dumps are helpful? I remember you generated some in the past but I don't recall if they also used jemalloc. Can this reasonably run in prod without causing significant overhead or is this still problematic?
If overhead is a concern I think we should be able to use the socket to start profiling on demand as well (instead of always profiling and only dumping on demand).
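If that route were taken, a rough sketch of flipping profiling on and off at runtime could look like the following. This is not part of the PR; it assumes the jemalloc `prof.active` mallctl and the `tikv-jemalloc-ctl` crate, with jemalloc started with profiling compiled in but inactive (e.g. `MALLOC_CONF="prof:true,prof_active:false"`):

```rust
// Hedged sketch, not from this PR: toggle jemalloc heap sampling at runtime,
// so the socket could start profiling on demand instead of always sampling.
fn set_profiling_active(active: bool) -> bool {
    // Mirrors the raw::read call quoted later in this thread, but writing instead.
    unsafe { tikv_jemalloc_ctl::raw::write(b"prof.active\0", active) }.is_ok()
}
```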
    // Check if jemalloc profiling is available before spawning the handler
    // This prevents panics that would crash the entire process
    let profiling_available =
Probably best to add `#[cfg(feature = "jemalloc-profiling")]` to the function. You technically still need this check in case we somehow mess up and don't use the jemalloc allocator despite enabling the feature, but the attribute better expresses the requirements for this function to work, IMO.
Also, why does this check have to happen inside catch_unwind? It seems like this should be doable without panicking.
During my tests, jemalloc_pprof::PROF_CTL panicked when a wrong config was provided, since the lib internally uses unsafe operations:
    let prof_enabled: bool = unsafe { raw::read(b"opt.prof\0") }.unwrap();
Gated the function with the feature flag.
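For reference, a minimal sketch of the gated shape being discussed (the function name and body are illustrative, not the exact PR code; the `catch_unwind` check is the one quoted in this thread):

```rust
// Illustrative sketch of the suggested gating; the actual handler in this PR
// may differ in name and details.
#[cfg(feature = "jemalloc-profiling")]
pub fn spawn_heap_dump_handler() {
    // Keep the runtime check even with the feature enabled, in case jemalloc was
    // not actually installed as the global allocator.
    let profiling_available =
        std::panic::catch_unwind(|| jemalloc_pprof::PROF_CTL.as_ref().is_some())
            .unwrap_or(false);
    if !profiling_available {
        tracing::warn!("jemalloc profiling unavailable; heap dump handler not started");
        return;
    }
    // ... spawn the Unix socket server here ...
}
```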
    let profiling_available =
        std::panic::catch_unwind(|| jemalloc_pprof::PROF_CTL.as_ref().is_some()).unwrap_or(false);

    if !profiling_available {
Unless I misunderstand something we'll never hit this case with the current code, no?
If jemalloc_pprof::PROF_CTL panics, we return false and then just print a log.
        }
    }

    fn binary_name() -> Option<String> {
Looks like some of the code was copy-pasted from the log filter socket stuff. Given that the new code also lives inside observe, we should be able to reuse some of it, no?
Yes, we can. I wanted to do the refactoring in a separate PR.
    match prof_ctl.dump_pprof() {
        Ok(pprof_data) => {
            tracing::info!(size_bytes = pprof_data.len(), "heap dump generated");
I think at this point we can drop the lock on prof_ctl.
It would also be nice to know how long we held the lock here.
Also, what does holding this lock mean for the rest of the process? Are we unable to make new allocations while it is held?
Makes sense, done
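A hedged sketch of what that change could look like (scoping the guard so it is released before any slow I/O, and logging how long it was held); the function name is illustrative, and the types follow my understanding of jemalloc_pprof's `PROF_CTL` handle, so they may need adjusting:

```rust
// Illustrative only: dump under the lock, record how long it was held,
// then release it before doing anything slow with the result.
async fn dump_with_timing(
    prof_ctl: std::sync::Arc<tokio::sync::Mutex<jemalloc_pprof::JemallocProfCtl>>,
) -> anyhow::Result<Vec<u8>> {
    let started = std::time::Instant::now();
    let result = {
        let mut guard = prof_ctl.lock().await;
        guard.dump_pprof()
        // guard (and therefore the lock) is dropped at the end of this block
    };
    tracing::info!(lock_held = ?started.elapsed(), "released jemalloc profiling lock");
    result
}
```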
        }
        Err(_) => {
            handle.abort();
            tracing::error!("heap dump request timed out after 5 minutes");
Suggested change:

    - tracing::error!("heap dump request timed out after 5 minutes");
    + tracing::error!(?time, "heap dump request timed out");
Probably nice to store the timeout in a variable and reference it here, so we don't run into the case where someone changes the timeout but not the log.
Also, how long is this process expected to run? 5 minutes seems like a very long time.
> Also, how long is this process expected to run? 5 minutes seems like a very long time.

I am not sure. Reduced it to 1 minute, since the report should be ready in a few seconds.
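A small sketch of the suggestion, with a named timeout constant referenced by the log so the two cannot drift apart; the constant and function names are illustrative, not the PR's:

```rust
use std::time::Duration;

// Illustrative constant; the PR now uses a 1 minute timeout.
const HEAP_DUMP_TIMEOUT: Duration = Duration::from_secs(60);

async fn await_dump(handle: tokio::task::JoinHandle<()>) {
    let abort = handle.abort_handle();
    if tokio::time::timeout(HEAP_DUMP_TIMEOUT, handle).await.is_err() {
        abort.abort();
        // Referencing the constant keeps the log accurate if the timeout changes.
        tracing::error!(timeout = ?HEAP_DUMP_TIMEOUT, "heap dump request timed out");
    }
}
```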
    WORKDIR /src/

    # Accept build arguments for enabling features
    ARG CARGO_BUILD_FEATURES=""
Where does this value actually get populated? It seems like we always set it to "".
Not always. The CI job uses it to build an image with the profiler enabled:
    CARGO_BUILD_FEATURES=--features jemalloc-profiling
Basically, the CI's --build-arg overrides the Dockerfile's default.
    /// Usage:
    /// ```bash
    /// # From your local machine (one-liner):
    /// kubectl exec <pod> -n <namespace> -- sh -c "echo dump | nc -U /tmp/heap_dump_<binary_name>.sock" > heap.pprof
    ///
    /// # Analyze with pprof:
    /// go tool pprof -http=:8080 heap.pprof
    /// ```
Can you please also document this in the README.md?
Done
This was already tested in prod a few months ago with no issues, but with a different version that used a simple API endpoint instead of Unix sockets; the socket approach is considered more concise. I'll be testing it in shadow and staging for a few days. Once confident with that, I will do the same in prod. Only jemalloc is currently possible.
Background
Production memory issues are difficult to debug without heap profiling. We need on-demand memory dumps that can be collected from running services without restarts or separate profiling builds.
Description
Adds optional jemalloc heap profiling behind a `jemalloc-profiling` feature flag. When enabled, services expose a Unix socket that accepts dump requests and streams pprof-compatible heap profiles back.

Changes
- `jemalloc-profiling` feature flag to all 6 services
- `jemalloc_pprof` and `tikv-jemallocator` dependencies
- `heap_dump_handler.rs` module exposing a Unix socket at `/tmp/heap_dump_<service>.sock`
- `netcat-openbsd` for socket communication (also useful for the existing logs handler)
- `.github/workflows/deploy-profiling.yaml` for building profiling images

Services affected: orderbook, autopilot, driver, solvers. alerter and refunder are also affected, mostly to simplify the Dockerfile.
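The conditional allocator switch mentioned in the review summary might look roughly like this in each service's main.rs (a sketch under the feature flag described above, not necessarily the exact code):

```rust
// Jemalloc (with profiling support) when the feature is enabled, mimalloc otherwise.
#[cfg(feature = "jemalloc-profiling")]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

#[cfg(not(feature = "jemalloc-profiling"))]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
```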
Usage
Build with profiling via the `deploy-profiling` CI job.

Collect a heap dump from Kubernetes with the documented one-liner: `kubectl exec <pod> -n <namespace> -- sh -c "echo dump | nc -U /tmp/heap_dump_<binary_name>.sock" > heap.pprof`
How to test
Already tested on sepolia-staging. Requires bumping the memory a bit. Will continue testing in the shadow env.