Merged
29 commits
effc07c
Fix the Claude subagent
jbachorik Oct 13, 2025
b34087f
Add atomic critical section management
jbachorik Oct 17, 2025
7ff50b7
Enhance call trace storage with triple-buffering and hazard pointers
jbachorik Oct 17, 2025
007123d
Add enhanced testing infrastructure with crash handlers
jbachorik Oct 17, 2025
188ec09
Update profiling components with critical section integration
jbachorik Oct 17, 2025
cef3c57
Update native unit tests with enhanced infrastructure
jbachorik Oct 17, 2025
30106a6
Update build configuration for enhanced architecture
jbachorik Oct 17, 2025
1303c33
Document architectural enhancements in README
jbachorik Oct 17, 2025
e1c197a
Fix semantic alignment for ASAN compatibility
jbachorik Oct 20, 2025
1ac89a2
Add alignment documentation comments
jbachorik Oct 20, 2025
7bebba4
Fix ASAN stack allocation alignment issues in tests
jbachorik Oct 20, 2025
6192e94
Fix trailing whitespace in README
jbachorik Oct 20, 2025
c872cbc
Update the async-profiler lock
jbachorik Oct 20, 2025
d9dd7ad
Add arch-doc for CallTraceStorage
jbachorik Oct 20, 2025
4e0361c
Adjust the calltrace storage stresstests to how it is used IRL
jbachorik Oct 21, 2025
24e82ad
WIP
jbachorik Oct 21, 2025
12ebae6
Code cleanup based on the review comments
jbachorik Oct 21, 2025
98c2be2
Code cleanup based on the review comments
jbachorik Oct 21, 2025
3a27e09
Avoid TLS variables accessible from signal handlers
jbachorik Oct 21, 2025
d333fca
Relax InstanceIdTraceIdStressTest to account for hash duplication
jbachorik Oct 21, 2025
e020c8e
Use RAAI for CallTraceStorage hazard pointers
jbachorik Oct 22, 2025
6e31b15
Merge remote-tracking branch 'origin/jb/trace_storage_fix' into jb/tr…
jbachorik Oct 22, 2025
00d5f95
Memory order adjustments
jbachorik Oct 22, 2025
7c41c85
Method rename
jbachorik Oct 22, 2025
fc17277
Add comments about possible race in hazard pointers
jbachorik Oct 22, 2025
1e7d978
More memory order changes
jbachorik Oct 22, 2025
37ae627
Safely solve hazard pointers slot collisions
jbachorik Oct 23, 2025
b4bcc97
Avoid possible allocation in signal handler due to various std::hash …
jbachorik Oct 23, 2025
7f16d85
Unrelated J9 constantly failing tests - debugging some hypothesis
jbachorik Oct 23, 2025
75 changes: 75 additions & 0 deletions .claude/commands/build-and-summarize
@@ -0,0 +1,75 @@
#!/usr/bin/env bash
set -euo pipefail

mkdir -p build/logs build/reports/claude .claude/out
STAMP="$(date +%Y%m%d-%H%M%S)"

# Args (default to 'build')
ARGS=("$@")
if [ "${#ARGS[@]}" -eq 0 ]; then
  ARGS=(build)
fi

# Label for the log file from the first arg
LABEL="$(printf '%s' "${ARGS[0]}" | tr '/:' '__')"
LOG="build/logs/${STAMP}-${LABEL}.log"

# Ensure we clean the tail on exit
tail_pid=""
cleanup() { [ -n "${tail_pid:-}" ] && kill "$tail_pid" 2>/dev/null || true; }
trap cleanup EXIT INT TERM

echo "▶ Logging full Gradle output to: $LOG"
echo "▶ Running: ./gradlew ${ARGS[*]} -i --console=plain"
echo " (Console output here is minimized; the full log is in the file.)"
echo

# Start Gradle fully redirected to the log (no stdout/stderr to this session)
# Use stdbuf to make the output line-buffered in the log for timely tailing.
( stdbuf -oL -eL ./gradlew "${ARGS[@]}" -i --console=plain ) >"$LOG" 2>&1 &
gradle_pid=$!

# Minimal live progress: follow the log and print only interesting lines
# - Task starts
# - Final build status
# - Test summary lines
tail -n0 -F "$LOG" | awk '
  /^> Task / { print; fflush(); next }
  /^BUILD (SUCCESSFUL|FAILED)/ { print; fflush(); next }
  /[0-9]+ tests? (successful|failed|skipped)/ { print; fflush(); next }
' &
tail_pid=$!

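# Wait for Gradle to finish. Capture the exit status with `||` so that
# `set -e` does not abort the script before the summary is printed.
status=0
wait "$gradle_pid" || status=$?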

# Stop the tail and print a compact summary from the log
kill "$tail_pid" 2>/dev/null || true
tail_pid=""

echo
echo "=== Summary ==="
# Grab the last BUILD line and nearest test summary lines
awk '
  /^BUILD (SUCCESSFUL|FAILED)/ { lastbuild=$0 }
  /[0-9]+ tests? (successful|failed|skipped)/ { tests=$0 }
  END {
    if (lastbuild) print lastbuild;
    if (tests) print tests;
  }
' "$LOG" || true

echo
if [ "$status" -eq 0 ]; then
  echo "✔ Gradle completed. Full log at: $LOG"
else
  echo "✖ Gradle failed with status $status. Full log at: $LOG"
fi

# Hand over to your logs analyst agent — keep the main session output tiny.
echo
echo "Delegating to gradle-logs-analyst agent…"
# If your CLI supports non-streaming, set it here to avoid verbose output.
# Example (uncomment if supported): export CLAUDE_NO_STREAM=1
claude "Act as the gradle-logs-analyst agent to parse the build log at: $LOG. Generate the required Gradle summary artifacts as specified in the gradle-logs-analyst agent definition."
34 changes: 4 additions & 30 deletions .claude/commands/build-and-summarize.md
@@ -1,33 +1,7 @@
---
description: Run a Gradle task, capture console to a timestamped log, then delegate parsing to the sub-agent and reply briefly.
usage: "/build-and-summarize <gradle-args...>"
---
# build-and-summarize

**Task:** Build with Gradle (plain console, info level), capture output to `build/logs/`, then have `gradle-log-analyst` parse the log and write:
- `build/reports/claude/gradle-summary.md`
- `build/reports/claude/gradle-summary.json`

Make sure to use the JAVA_HOME environment variable is set appropriately.
Runs `./gradlew` with full output captured to a timestamped log, shows minimal live progress (task starts + final build/test summary), then asks the `gradle-logs-analyst` agent to produce structured artifacts from the log.

## Usage
```bash
set -euo pipefail
mkdir -p build/logs build/reports/claude
STAMP="$(date +%Y%m%d-%H%M%S)"

# Default to 'build' if no args were given
ARGS=("$@")
if [ "${#ARGS[@]}" -eq 0 ]; then
ARGS=(build)
fi

# Make a filename-friendly label (first arg only)
LABEL="$(echo "${ARGS[0]}" | tr '/:' '__')"
LOG="build/logs/${STAMP}-${LABEL}.log"

echo "Running: ./gradlew ${ARGS[*]} -i --console=plain"
# Capture both stdout and stderr to the log while streaming to terminal
(./gradlew "${ARGS[@]}" -i --console=plain 2>&1 | tee "$LOG") || true

# Delegate parsing to the sub-agent
echo "Delegating to gradle-logs-analyst agent..."
claude "Act as the gradle-logs-analyst agent to parse the build log at: $LOG. Generate the required gradle summary artifacts as specified in the gradle-logs-analyst agent definition."
./.claude/commands/build-and-summarize [<gradle-args>...]
3 changes: 2 additions & 1 deletion .claude/settings.local.json
@@ -13,7 +13,8 @@
"Bash(grep:*)",
"WebFetch(domain:github.com)",
"WebFetch(domain:raw.githubusercontent.com)",
"WebFetch(domain:raw.githubusercontent.com)"
"WebFetch(domain:raw.githubusercontent.com)",
"Bash(./.claude/commands/build-and-summarize:*)"
],
"deny": [],
"ask": []
82 changes: 59 additions & 23 deletions CLAUDE.md
@@ -41,7 +41,7 @@ You are the **Main Orchestrator** for this repository.
“Use `gradle-log-analyst` to parse LOG_PATH; write the two reports; reply with only a 3–6 line status and the two relative file paths.”

### Shortcuts I Expect
- `/build-and-summarize <gradle-task...>` to do everything in one step.
- `./gradlew <gradle-task...>` to do everything in one step.
- If I just say “build assembleDebugJar”, interpret that as the shortcut above.

## Build Commands
@@ -50,74 +50,74 @@ Never use 'gradle' or 'gradlew' directly. Instead, use the '/build-and-summarize
### Main Build Tasks
```bash
# Build release version (primary artifact)
/build-and-summarize buildRelease
./gradlew buildRelease

# Build all configurations
/build-and-summarize assembleAll
./gradlew assembleAll

# Clean build
/build-and-summarize clean
./gradlew clean
```

### Development Builds
```bash
# Debug build with symbols
/build-and-summarize buildDebug
./gradlew buildDebug

# ASan build (if available)
/build-and-summarize buildAsan
./gradlew buildAsan

# TSan build (if available)
/build-and-summarize buildTsan
./gradlew buildTsan
```

### Testing
```bash
# Run specific test configurations
/build-and-summarize testRelease
/build-and-summarize testDebug
/build-and-summarize testAsan
/build-and-summarize testTsan
./gradlew testRelease
./gradlew testDebug
./gradlew testAsan
./gradlew testTsan

# Run C++ unit tests only
/build-and-summarize gtestDebug
/build-and-summarize gtestRelease
./gradlew gtestDebug
./gradlew gtestRelease

# Cross-JDK testing
JAVA_TEST_HOME=/path/to/test/jdk /build-and-summarize testDebug
JAVA_TEST_HOME=/path/to/test/jdk ./gradlew testDebug
```

### Build Options
```bash
# Skip native compilation
/build-and-summarize buildDebug -Pskip-native
./gradlew buildDebug -Pskip-native

# Skip all tests
/build-and-summarize buildDebug -Pskip-tests
./gradlew buildDebug -Pskip-tests

# Skip C++ tests
/build-and-summarize buildDebug -Pskip-gtest
./gradlew buildDebug -Pskip-gtest

# Keep JFR recordings after tests
/build-and-summarize testDebug -PkeepJFRs
./gradlew testDebug -PkeepJFRs

# Skip debug symbol extraction
/build-and-summarize buildRelease -Pskip-debug-extraction=true
./gradlew buildRelease -Pskip-debug-extraction=true
```

### Code Quality
```bash
# Format code
/build-and-summarize spotlessApply
./gradlew spotlessApply

# Static analysis
/build-and-summarize scanBuild
./gradlew scanBuild

# Run stress tests
/build-and-summarize :ddprof-stresstest:runStressTests
./gradlew :ddprof-stresstest:runStressTests

# Run benchmarks
/build-and-summarize runBenchmarks
./gradlew runBenchmarks
```

## Architecture
@@ -338,3 +338,39 @@ With separate debug symbol packages for production debugging support.

- Run tests with 'testdebug' gradle task
- Use at most Java 21 to build and run tests

## Agentic Work

- Never run `./gradlew` directly.
- Always invoke the wrapper command: `./.claude/commands/build-and-summarize`.
- Pass through all arguments exactly as you would to `./gradlew`.
- Examples:
  - Instead of:
    ```bash
    ./gradlew build
    ```
    use:
    ```bash
    ./.claude/commands/build-and-summarize build
    ```
  - Instead of:
    ```bash
    ./gradlew :prof-utils:test --tests "UpscaledMethodSampleEventSinkTest"
    ```
    use:
    ```bash
    ./.claude/commands/build-and-summarize :prof-utils:test --tests "UpscaledMethodSampleEventSinkTest"
    ```

- This ensures the full build log is captured to a file and only a summary is shown in the main session.

## Ground rules
- Never replace the code you work on with stubs
- Never 'fix' the tests by testing constants against constants
- Never claim success until all affected tests are passing
- Always provide javadoc for public classes and methods
- Provide javadoc for non-trivial private and package private code
- Always provide comprehensive tests for new functionality
- Always provide tests for bug fixes - test fails before the fix, passes after the fix
- All code should be lean in resource consumption and easy to follow;
do not shy away from factoring self-contained code out into shorter, explicitly named functions
54 changes: 54 additions & 0 deletions README.md
@@ -348,6 +348,60 @@ The project includes JMH-based stress tests:
- ASan: `libasan`
- TSan: `libtsan`

## Architectural Tidbits

This section documents important architectural decisions and enhancements made to the profiler core.

### Critical Section Management (2025)

Introduced race-free critical section management using atomic compare-and-swap operations instead of expensive signal blocking syscalls:

- **`CriticalSection` class**: Thread-local atomic flag-based protection against signal handler reentrancy
- **Lock-free design**: Uses `compare_exchange_strong` for atomic claiming of critical sections
- **Signal handler safety**: Eliminates race conditions between signal handlers and normal code execution
- **Performance improvement**: Avoids costly `sigprocmask`/`pthread_sigmask` syscalls in hot paths

**Key files**: `criticalSection.h`, `criticalSection.cpp`
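
The flag-claiming idea above can be sketched roughly as follows. This is a minimal illustration, not the actual `CriticalSection` class: the names `ReentrancyGuard` and `in_critical` are hypothetical, and a plain `thread_local` flag is assumed for brevity, whereas the real implementation may keep its per-thread state elsewhere.

```cpp
#include <atomic>

// Hypothetical per-thread flag; the real class may store its state differently.
static thread_local std::atomic<bool> in_critical{false};

// RAII guard: claims the flag with a CAS on construction and releases it on
// destruction, but only if this guard was the one that claimed it.
class ReentrancyGuard {
public:
    ReentrancyGuard() {
        bool expected = false;
        // Succeeds only if nothing on this thread currently holds the flag.
        _entered = in_critical.compare_exchange_strong(
            expected, true, std::memory_order_acquire, std::memory_order_relaxed);
    }
    ~ReentrancyGuard() {
        if (_entered) {
            in_critical.store(false, std::memory_order_release);
        }
    }
    ReentrancyGuard(const ReentrancyGuard&) = delete;
    ReentrancyGuard& operator=(const ReentrancyGuard&) = delete;

    // False means the section was already held (for example, a signal handler
    // interrupted code inside the section), so the caller should bail out.
    bool entered() const { return _entered; }

private:
    bool _entered;
};
```

A signal handler would construct such a guard first and return early when `entered()` is false, which is what makes blocking signals with `sigprocmask` unnecessary in this scheme.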

### Triple-Buffered Call Trace Storage (2025)

Enhanced the call trace storage system from double-buffered to triple-buffered architecture with hazard pointer-based memory reclamation:

- **Triple buffering**: Active, standby, and cleanup storage rotation for smoother transitions
- **Hazard pointer system**: Per-instance thread-safe memory reclamation without global locks
- **ABA protection**: Generation counter prevents race conditions during table swaps
- **Instance-based trace IDs**: 64-bit IDs combining instance ID and slot for collision-free trace management
- **Lock-free hot paths**: Atomic operations minimize contention during profiling events
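
To make the instance-based trace ID idea concrete, here is a small sketch of one way to pack two values into a 64-bit ID. The bit layout (instance ID in the high 32 bits, slot in the low 32 bits) and the function names are assumptions for illustration; the source does not state the actual layout.

```cpp
#include <cstdint>

// Assumed layout (not confirmed by the source): high 32 bits = instance ID,
// low 32 bits = slot index within that instance's hash table.
constexpr uint64_t make_trace_id(uint32_t instance_id, uint32_t slot) {
    return (static_cast<uint64_t>(instance_id) << 32) | slot;
}

// Extracts the instance ID from a packed trace ID.
constexpr uint32_t trace_instance(uint64_t trace_id) {
    return static_cast<uint32_t>(trace_id >> 32);
}

// Extracts the slot index from a packed trace ID.
constexpr uint32_t trace_slot(uint64_t trace_id) {
    return static_cast<uint32_t>(trace_id & 0xFFFFFFFFu);
}
```

Under this layout, two tables that reuse the same slot index still yield distinct IDs as long as their instance IDs differ.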

**Key changes**:
- Replaced `SpinLock` with atomic pointers and hazard pointer system
- Added generation counter for safe table swapping
- Enhanced liveness preservation across storage rotations
- Improved thread safety for high-frequency profiling scenarios

**Key files**: `callTraceStorage.h`, `callTraceStorage.cpp`, `callTraceHashTable.h`, `callTraceHashTable.cpp`
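
The hazard pointer protocol mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in `callTraceStorage.cpp`: `Table`, `g_active`, `g_hazard`, `kMaxSlots`, and the four functions are hypothetical names, every atomic operation uses the default sequentially consistent ordering, and the triple-buffer rotation, generation counter, and slot-collision handling are left out.

```cpp
#include <atomic>
#include <cstddef>

struct Table { int id; };  // stand-in for a call trace hash table

constexpr std::size_t kMaxSlots = 64;         // hypothetical slot count
static std::atomic<Table*> g_active{nullptr}; // currently published table
static std::atomic<Table*> g_hazard[kMaxSlots]; // one slot per reader, zero-initialized

// Reader side: publish the pointer in a hazard slot, then re-check that it is
// still the active table. If a swap happened in between, retry.
Table* acquire(std::size_t slot) {
    Table* t;
    do {
        t = g_active.load();
        g_hazard[slot].store(t);
    } while (t != g_active.load());
    return t;
}

// Reader side: clear the slot once the table is no longer in use.
void release(std::size_t slot) {
    g_hazard[slot].store(nullptr);
}

// Writer side: true if any reader still has `t` published.
bool is_protected(const Table* t) {
    for (std::size_t i = 0; i < kMaxSlots; ++i) {
        if (g_hazard[i].load() == t) return true;
    }
    return false;
}

// Writer side: swap in a new table and hand back the old one. The caller may
// only free or reuse the old table once is_protected(old) returns false.
Table* swap_active(Table* fresh) {
    return g_active.exchange(fresh);
}
```

The re-check in `acquire` is what closes the window between reading the pointer and publishing it: either the reader notices the swap and retries, or the writer's scan sees the published slot and defers reclamation.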

### Enhanced Testing Infrastructure (2025)

Comprehensive testing improvements for better debugging and stress testing:

- **GTest crash handler**: Detailed crash reporting with backtraces and register state for native code failures
- **Stress testing framework**: Multi-threaded stress tests for call trace storage under high contention
- **Platform-specific debugging**: macOS and Linux register state capture in crash handlers
- **Async-signal-safe reporting**: Crash handlers use only signal-safe functions for reliable diagnostics

**Key files**: `gtest_crash_handler.h`, `stress_callTraceStorage.cpp`

### TLS Priming Enhancements (2025)

Improved thread-local storage initialization to prevent race conditions:

- **Solid TLS priming**: Enhanced thread-local variable initialization timing
- **Signal handler compatibility**: Ensures TLS is fully initialized before signal handler access
- **Cross-platform consistency**: Unified TLS handling across Linux and macOS platforms

These architectural improvements focus on eliminating race conditions, improving performance in high-throughput scenarios, and providing better debugging capabilities for the native profiling engine.

## Contributing
1. Fork the repository
2. Create a feature branch