Add unwinding validation and reporting #274

jbachorik · 2025-09-15T14:37:58Z

What does this PR do?:
This PR adds a bunch of scenarios that are quite challenging for the vmstructs unwinder and measures the error rate based on various 'unknown()', 'break_' and 'invalid_' frames.
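
To make the idea of an error rate concrete, here is a minimal sketch, not the code from this PR. It assumes a sample counts as an error when any frame name contains one of the markers quoted above. The class name `UnwindErrorRate`, the substring matching, and the per-sample definition are all assumptions for illustration.

```java
import java.util.List;

final class UnwindErrorRate {
    // Markers taken from the PR description; matching by substring is an assumption.
    private static final String[] ERROR_MARKERS = {"unknown()", "break_", "invalid_"};

    /** Returns true if the frame name contains one of the error markers. */
    static boolean isErrorFrame(String frame) {
        for (String marker : ERROR_MARKERS) {
            if (frame.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Fraction of samples whose stack contains at least one error frame. */
    static double errorRate(List<List<String>> samples) {
        if (samples.isEmpty()) {
            return 0.0; // no samples, nothing to report
        }
        long bad = samples.stream()
                .filter(stack -> stack.stream().anyMatch(UnwindErrorRate::isErrorFrame))
                .count();
        return (double) bad / samples.size();
    }

    public static void main(String[] args) {
        // Hypothetical samples: two of the four stacks contain an error frame.
        List<List<String>> samples = List.of(
                List.of("Foo.bar", "Foo.main"),
                List.of("unknown()", "Foo.main"),
                List.of("break_compiled", "Foo.main"),
                List.of("Foo.baz", "Foo.main"));
        System.out.println("error rate: " + errorRate(samples));
    }
}
```

Counting per sample rather than per frame keeps one badly broken stack from dominating the metric. The actual validator may weigh things differently.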

Motivation:
We want to have a way to see if our changes in the unwinding engine are actually making anything better or not.

Additional Notes:
While working on this I had to bump the async-profiler base, and I reworked the patching to be more 'sane'.

The validation is not blowing up tests (yet) but rather adds the metrics to the CI run report so one can get a glimpse of how each combination of os/arch/java version is doing.
The reports are also stored in the persisted reports so they can be downloaded.
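
As a rough illustration of what a per-combination report could look like, the sketch below renders error rates as a Markdown table. It is only a guess at the shape of such a report. The class `UnwindReportMarkdown`, the platform keys, and the column layout are hypothetical and not taken from the PR.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class UnwindReportMarkdown {
    /** Renders one Markdown table row per platform; rates are fractions in [0, 1]. */
    static String render(Map<String, Double> errorRateByPlatform) {
        StringBuilder sb = new StringBuilder();
        sb.append("| Platform | Error rate |\n");
        sb.append("|---|---|\n");
        for (Map.Entry<String, Double> e : errorRateByPlatform.entrySet()) {
            // Locale.ROOT keeps the decimal separator stable across CI machines.
            sb.append(String.format(Locale.ROOT, "| %s | %.2f%% |\n",
                    e.getKey(), e.getValue() * 100.0));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical os/arch/java combinations and made-up rates.
        Map<String, Double> rates = new LinkedHashMap<>();
        rates.put("linux/x86_64/jdk11", 0.25);
        rates.put("linux/aarch64/jdk11", 0.5);
        System.out.print(render(rates));
    }
}
```

A `LinkedHashMap` preserves insertion order, so the rows appear in the order the combinations were recorded.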

How to test the change?:

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.
JIRA: PROF-12462

Unsure? Have a question? Request a review!

# Conflicts: # ddprof-lib/src/main/cpp/stackFrame.h # gradle/lock.properties

pr-commenter · 2025-09-15T14:49:33Z

Benchmarks [aarch64 wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	wall	wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 20 unstable metrics.

pr-commenter · 2025-09-15T16:23:30Z

Benchmarks [aarch64 memleak,alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak,alloc	memleak,alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 20 unstable metrics.

pr-commenter · 2025-09-15T16:24:20Z

Benchmarks [aarch64 alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	alloc	alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 20 unstable metrics.

pr-commenter · 2025-09-15T16:24:20Z

Benchmarks [aarch64 cpu,wall,alloc,memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	cpu,wall,alloc,memleak	cpu,wall,alloc,memleak
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 20 unstable metrics.

pr-commenter · 2025-09-15T16:24:32Z

Benchmarks [aarch64 cpu,wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu,wall	cpu,wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter · 2025-09-15T16:24:41Z

Benchmarks [x86_64 cpu,wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu,wall	cpu,wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-09-15T16:24:45Z

Benchmarks [x86_64 cpu,wall,alloc,memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	cpu,wall,alloc,memleak	cpu,wall,alloc,memleak
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter · 2025-09-15T16:25:22Z

Benchmarks [x86_64 memleak,alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak,alloc	memleak,alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-09-15T16:30:11Z

Benchmarks [x86_64 memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak	memleak
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

ddprof-test/src/main/java/com/datadoghq/profiler/unwinding/UnwindingMetrics.java

MattAlp · 2025-09-16T16:31:10Z

ddprof-test/src/main/java/com/datadoghq/profiler/unwinding/UnwindingValidator.java

+        TEXT, JSON, MARKDOWN
+    }
+    public enum Scenario {
+        C2_COMPILATION_TRIGGERS("C2CompilationTriggers"),


Could any of the scenarios interfere with the behavior of others, i.e. in the case where we run all of them?

They should not. They are not run in parallel, and the only ones I would expect to be sensitive are the JIT-related ones. But we are JITing methods specific to each scenario, so we should be more or less OK.

Of course, this being a dynamic system, that always comes with a grain of salt and at least one hand wave, but so far the results look legit.

Ok. This thread can act as a little footnote in case you do see interference-induced failures in the future.

ddprof-test/src/main/java/com/datadoghq/profiler/unwinding/UnwindingValidator.java

pr-commenter · 2025-09-16T17:24:45Z

Benchmarks [x86_64 cpu]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu	cpu
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics.

pr-commenter · 2025-09-16T17:25:31Z

Benchmarks [aarch64 cpu]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu	cpu
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

zhengyu123 · 2025-09-15T19:50:50Z

ddprof-lib/src/main/cpp/codeCache.h

                     const void *min_address = NO_MIN_ADDRESS,
                     const void *max_address = NO_MAX_ADDRESS,
-                     const char* image_base = NULL);
+                     const char* image_base = NULL,


Let's use nullptr instead.

Let's address this in a followup PR.
This one is already pretty big and if I change NULL->nullptr I want to do it consistently and it will cause changes in 47 files 😱
I have the change ready but will post it as a separate PR, together with an update to the tooling so that spotless enforces nullptr instead of NULL.

zhengyu123 · 2025-09-16T12:33:18Z

ddprof-lib/src/main/cpp/dwarf.h

@@ -1,180 +0,0 @@
-/*


Could you do git mv instead, so we can spot the diffs easily?

Next time, perhaps. Unbundling the commits and doing the manual git moves would take quite a lot of time for very little benefit in this case. The dwarf_dd.h is a wrapper around dwarf.h, so I would have to force git to accept this as a move even though the contents are very much not alike.

zhengyu123 · 2025-09-16T12:34:02Z

ddprof-lib/src/main/cpp/dwarf_dd.h

+};
+}
+
+// class DwarfParser {


Remove commented out code?

pr-commenter · 2025-09-16T18:28:42Z

Benchmarks [x86_64 wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	wall	wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-09-16T18:29:07Z

Benchmarks [x86_64 alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	alloc	alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-09-16T18:32:30Z

Benchmarks [aarch64 memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.31.0	1.32.0-jb_unwinding_2-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak	memleak
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 20 unstable metrics.

jbachorik and others added 8 commits September 15, 2025 16:30

Introduce JIT unwinding validation suite

517121c

# Conflicts: # ddprof-lib/src/main/cpp/stackFrame.h # gradle/lock.properties

Add Claude agent for handling gradle builds

cfaab97

Fix Macos build

a40516d

Adjust for current async-profiler upstream base

b79d628

Try installing debug symbols in CI

c599682

Spotless!

afbfb26

Fixing tests

0b6f7b3

Fix CI report naming

d1bd2b1

jbachorik added no-release-notes generated: claude labels Sep 15, 2025

Cleanup the PR

4443f50

jbachorik mentioned this pull request Sep 16, 2025

Add comprehensive unwinding validation with advanced reporting #269

Closed

3 tasks

jbachorik requested review from MattAlp and zhengyu123 September 16, 2025 15:37

MattAlp reviewed Sep 16, 2025

Fix typo

10e150e

zhengyu123 requested changes Sep 16, 2025

zhengyu123 approved these changes Sep 17, 2025

jbachorik merged commit 21523a3 into main Sep 17, 2025
95 checks passed

jbachorik deleted the jb/unwinding_2 branch September 17, 2025 13:29

github-actions bot added this to the 1.32.0 milestone Sep 17, 2025

zhengyu123 mentioned this pull request Sep 22, 2025

Bump ddprof to 1.32.0 DataDog/dd-trace-java#9584

Merged
