Draft
22 commits
c52da1c
Add remote symbolication support with build-id and PC offset
jbachorik Jan 7, 2026
bd80f4c
Fix RemoteSymbolicationTest assertions to match JMC stack trace format
jbachorik Jan 7, 2026
04a7e9c
Add debug output to RemoteSymbolicationTest to diagnose failure
jbachorik Jan 7, 2026
161ad44
Extend jdk.NativeLibrary with buildId and loadBias fields
jbachorik Jan 7, 2026
55fbce3
Add test library with build-id for remote symbolication testing
jbachorik Jan 7, 2026
0f3951b
Add 'vm' cstack mode to RemoteSymbolicationTest
jbachorik Jan 7, 2026
dde78ee
Refactor build-id extraction to follow project architecture
jbachorik Jan 7, 2026
a98db7a
Fix compilation errors in remote symbolication test
jbachorik Jan 7, 2026
8604a97
Add missing cstdint include for uint8_t type
jbachorik Jan 7, 2026
0fa6f7d
Fix compilation errors and enhance remote symbolication test
jbachorik Jan 7, 2026
6924e55
Fix jdk.NativeLibrary events not being emitted
jbachorik Jan 7, 2026
ff8476f
Deduplicate native_libs collections in Profiler and Libraries
jbachorik Jan 7, 2026
fd63395
Fix remote symbolication by deferring symbol resolution
jbachorik Jan 7, 2026
ef0b7e1
Increase native workload in RemoteSymbolicationTest
jbachorik Jan 7, 2026
d63365e
Fix RemoteSymbolicationTest frame detection
jbachorik Jan 7, 2026
1b7b1cc
Add debug logging for remote symbolication troubleshooting
jbachorik Jan 7, 2026
e3d9a43
Add missing common.h include for TEST_LOG macro
jbachorik Jan 7, 2026
8d758bf
Apply remote symbolication to VM/VMX stack walkers
jbachorik Jan 8, 2026
1215dc1
Implement signal-safe pre-allocated pool for RemoteFrameInfo
jbachorik Jan 13, 2026
0e4235c
Fix remote symbolication for VM/VMX stack walkers
jbachorik Jan 13, 2026
45d0122
Fix the 'remotesym' arg name
jbachorik Jan 13, 2026
3fa35e2
Update REMOTE_SYMBOLICATION.md with current implementation
jbachorik Jan 13, 2026
19 changes: 19 additions & 0 deletions README.md
@@ -402,6 +402,25 @@ Improved thread-local storage initialization to prevent race conditions:

These architectural improvements focus on eliminating race conditions, improving performance in high-throughput scenarios, and providing better debugging capabilities for the native profiling engine.

### Remote Symbolication Support (2025)

Added support for remote symbolication to enable offloading symbol resolution from the agent to backend services:

- **Build-ID extraction**: Automatically extracts GNU build-id from ELF binaries on Linux
- **Raw addressing information**: Stores build-id and PC offset instead of resolved symbol names
- **Remote symbolication mode**: Enable with `remotesym=true` profiler argument
- **JFR integration**: Remote frames serialized with build-id and offset for backend resolution
- **Zero encoding overhead**: Uses dedicated frame type (FRAME_NATIVE_REMOTE) for efficient serialization

**Benefits**:
- Reduces agent overhead by eliminating local symbol resolution
- Enables centralized symbol resolution with better caching
- Supports scenarios where debug symbols are not available locally

**Key files**: `elfBuildId.h`, `elfBuildId.cpp`, `profiler.cpp`, `flightRecorder.cpp`

For detailed documentation, see [doc/REMOTE_SYMBOLICATION.md](doc/REMOTE_SYMBOLICATION.md).

## Contributing
1. Fork the repository
2. Create a feature branch
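
To make the README's "raw addressing information" bullet concrete, here is a minimal sketch of what a remote frame might carry. It assumes the PC offset is the sampled PC minus the library's load bias, which this excerpt does not confirm; `LoadedLib`, `RemoteFrame` and `toRemoteFrame` are hypothetical names, not types from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified view of one loaded library. The field names are
// illustrative and do not mirror CodeCache exactly.
struct LoadedLib {
  std::string build_id;   // GNU build-id as a hex string
  uintptr_t min_address;  // first mapped address (inclusive)
  uintptr_t max_address;  // end of the mapping (exclusive, by assumption)
  uintptr_t load_bias;    // runtime address minus address in the ELF file
};

// What a remote frame carries instead of a resolved symbol name.
struct RemoteFrame {
  std::string build_id;
  uintptr_t pc_offset;
};

// Assumption: the offset is the PC with the load bias subtracted, i.e. the
// address as it would appear in the ELF file on disk.
inline RemoteFrame toRemoteFrame(const LoadedLib &lib, uintptr_t pc) {
  assert(pc >= lib.min_address && pc < lib.max_address);
  return RemoteFrame{lib.build_id, pc - lib.load_bias};
}
```

A backend holding the matching binary (found by build-id) could then look up `pc_offset` in that file's symbol table without knowing where the library was mapped in the profiled process.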
27 changes: 21 additions & 6 deletions ddprof-lib/src/main/cpp/arguments.cpp
@@ -88,7 +88,10 @@ static const Multiplier UNIVERSAL[] = {
// samples
// generations - track surviving generations
// lightweight[=BOOL] - enable lightweight profiling - events without
// stacktraces (default: true) jfr - dump events in Java
// stacktraces (default: true)
// remotesym[=BOOL] - enable remote symbolication for native frames
// (stores build-id and PC offset instead of symbol names)
Comment on lines +92 to +93
Copilot AI Jan 13, 2026

The comment documents "remotesym[=BOOL]", but the implementation does not follow the usual boolean argument pattern: it inspects only the first character of the value and enables the feature for 'y' or 't'. Either accept the standard boolean values (true/false, yes/no, 1/0) in the parser, or update the comment to say that only values starting with 'y' or 't' enable remote symbolication.

Suggested change
- // remotesym[=BOOL] - enable remote symbolication for native frames
- // (stores build-id and PC offset instead of symbol names)
+ // remotesym[=FLAG] - enable remote symbolication for native frames when
+ // FLAG is 'y' or 't' (stores build-id and PC offset instead
+ // of symbol names; any other value disables remote
+ // symbolication)

// jfr - dump events in Java
// Flight Recorder format interval=N - sampling interval in ns
// (default: 10'000'000, i.e. 10 ms) jstackdepth=N - maximum Java stack
// depth (default: 2048) safemode=BITS - disable stack recovery
@@ -339,18 +342,30 @@ Error Arguments::parse(const char *args) {
_enable_method_cleanup = true;
}

CASE("wallsampler")
CASE("remotesym")
if (value != NULL) {
switch (value[0]) {
case 'j':
_wallclock_sampler = JVMTI;
case 'y': // yes
case 't': // true
_remote_symbolication = true;
break;
case 'a':
default:
_wallclock_sampler = ASGCT;
_remote_symbolication = false;
}
}
Comment on lines +345 to 355
Copilot AI Jan 13, 2026

The argument parsing for "remotesym" uses a simple switch statement that only checks the first character. This means "remotesym=n" or "remotesym=no" or "remotesym=0" would all be treated as false (falling to default), but "remotesym=yes" would work while "remotesym=yikes" would also enable it. Consider using more robust parsing like the existing parseBool function pattern used elsewhere in the codebase for consistency.
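
One way to act on this comment is a whole-string comparison against a fixed set of spellings. The sketch below is illustrative only: `parseBoolStrict` is a hypothetical helper, not the project's existing parsing function, and it assumes `value` is NUL-terminated by the time it reaches the parser. Note also that, as written in the hunk, a bare `remotesym` with no value leaves the flag at its default of `false`; passing `true` as the fallback would make the bare form enable the feature.

```cpp
#include <cstring>

// Hypothetical helper, not part of the PR: accepts a fixed set of spellings
// and returns `fallback` for a missing or unrecognised value.
inline bool parseBoolStrict(const char *value, bool fallback) {
  if (value == nullptr) {
    return fallback;  // e.g. a bare "remotesym" with no "=..."
  }
  static const char *const kTrue[] = {"true", "t", "yes", "y", "1"};
  static const char *const kFalse[] = {"false", "f", "no", "n", "0"};
  for (const char *s : kTrue) {
    if (std::strcmp(value, s) == 0) return true;
  }
  for (const char *s : kFalse) {
    if (std::strcmp(value, s) == 0) return false;
  }
  return fallback;  // "yikes" lands here instead of enabling the feature
}
```

Whether an unrecognised value should fall back silently or be reported as an argument error is a design choice this sketch leaves open.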


CASE("wallsampler")
if (value != NULL) {
switch (value[0]) {
case 'j':
_wallclock_sampler = JVMTI;
break;
case 'a':
default:
_wallclock_sampler = ASGCT;
}
}

DEFAULT()
if (_unknown_arg == NULL)
_unknown_arg = arg;
4 changes: 3 additions & 1 deletion ddprof-lib/src/main/cpp/arguments.h
@@ -188,6 +188,7 @@ class Arguments {
std::vector<std::string> _context_attributes;
bool _lightweight;
bool _enable_method_cleanup;
bool _remote_symbolication; // Enable remote symbolication for native frames

Arguments(bool persistent = false)
: _buf(NULL),
@@ -221,7 +222,8 @@ class Arguments {
_context_attributes({}),
_wallclock_sampler(ASGCT),
_lightweight(false),
_enable_method_cleanup(true) {}
_enable_method_cleanup(true),
_remote_symbolication(false) {}

~Arguments();

66 changes: 63 additions & 3 deletions ddprof-lib/src/main/cpp/codeCache.cpp
@@ -37,6 +37,11 @@ CodeCache::CodeCache(const char *name, short lib_index,
_plt_size = 0;
_debug_symbols = false;

// Initialize build-id fields
_build_id = nullptr;
_build_id_len = 0;
_load_bias = 0;

memset(_imports, 0, sizeof(_imports));
_imports_patchable = imports_patchable;

@@ -54,10 +59,27 @@ CodeCache::CodeCache(const CodeCache &other) {
_min_address = other._min_address;
_max_address = other._max_address;
_text_base = other._text_base;
_image_base = other._image_base;

_imports_patchable = other._imports_patchable;
_plt_offset = other._plt_offset;
_plt_size = other._plt_size;
_debug_symbols = other._debug_symbols;

// Copy build-id information
_build_id_len = other._build_id_len;
if (other._build_id != nullptr && other._build_id_len > 0) {
size_t hex_str_len = strlen(other._build_id);
_build_id = static_cast<char*>(malloc(hex_str_len + 1));
if (_build_id != nullptr) {
strcpy(_build_id, other._build_id);
}
} else {
_build_id = nullptr;
}
_load_bias = other._load_bias;

memset(_imports, 0, sizeof(_imports));
_imports_patchable = other._imports_patchable;

_dwarf_table_length = other._dwarf_table_length;
_dwarf_table = new FrameDesc[_dwarf_table_length];
@@ -77,17 +99,34 @@ CodeCache &CodeCache::operator=(const CodeCache &other) {
delete _name;
delete _dwarf_table;
delete _blobs;
free(_build_id); // Free existing build-id

_name = NativeFunc::create(other._name, -1);
_lib_index = other._lib_index;
_min_address = other._min_address;
_max_address = other._max_address;
_text_base = other._text_base;

_imports_patchable = other._imports_patchable;
_image_base = other._image_base;

_plt_offset = other._plt_offset;
_plt_size = other._plt_size;
_debug_symbols = other._debug_symbols;

// Copy build-id information
_build_id_len = other._build_id_len;
if (other._build_id != nullptr && other._build_id_len > 0) {
size_t hex_str_len = strlen(other._build_id);
_build_id = static_cast<char*>(malloc(hex_str_len + 1));
if (_build_id != nullptr) {
strcpy(_build_id, other._build_id);
}
} else {
_build_id = nullptr;
}
_load_bias = other._load_bias;

memset(_imports, 0, sizeof(_imports));
_imports_patchable = other._imports_patchable;

_dwarf_table_length = other._dwarf_table_length;
_dwarf_table = new FrameDesc[_dwarf_table_length];
@@ -110,6 +149,7 @@ CodeCache::~CodeCache() {
NativeFunc::destroy(_name);
delete[] _blobs;
delete _dwarf_table;
free(_build_id); // Free build-id memory
}

void CodeCache::expand() {
@@ -387,3 +427,23 @@ FrameDesc CodeCache::findFrameDesc(const void *pc) {
return FrameDesc::default_frame;
}
}

void CodeCache::setBuildId(const char* build_id, size_t build_id_len) {
// Free existing build-id if any
free(_build_id);
_build_id = nullptr;
_build_id_len = 0;

if (build_id != nullptr && build_id_len > 0) {
// build_id is a hex string, allocate based on actual string length
size_t hex_str_len = strlen(build_id);
_build_id = static_cast<char*>(malloc(hex_str_len + 1));

if (_build_id != nullptr) {
// Copy the hex string
strcpy(_build_id, build_id);
// Store the original byte length (not hex string length)
_build_id_len = build_id_len;
}
}
}
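
The same "measure, allocate, copy" sequence for the build-id string appears in the copy constructor, in `operator=` and in `setBuildId`. A small helper could hold it in one place. This is a sketch under that reading of the diff; `dupHexString` is a hypothetical name. In the two copy paths shown above, `_build_id_len` is assigned before the allocation, so a failed `malloc` would leave a non-zero length next to a null pointer, whereas `setBuildId` only sets the length on success. A shared helper makes it easier to treat both cases the same way.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a malloc'ed copy of a NUL-terminated hex
// string, or nullptr when the input is null or the allocation fails.
// The caller releases the result with free().
inline char *dupHexString(const char *hex) {
  if (hex == nullptr) {
    return nullptr;
  }
  std::size_t len = std::strlen(hex);
  char *copy = static_cast<char *>(std::malloc(len + 1));
  if (copy != nullptr) {
    std::memcpy(copy, hex, len + 1);  // includes the terminating NUL
  }
  return copy;
}
```

A caller would then set the stored byte length to zero whenever the helper returns `nullptr`.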
22 changes: 20 additions & 2 deletions ddprof-lib/src/main/cpp/codeCache.h
@@ -116,6 +116,11 @@ class CodeCache {
unsigned int _plt_offset;
unsigned int _plt_size;

// Build-ID and load bias for remote symbolication
char *_build_id; // GNU build-id (hex string, null if not available)
size_t _build_id_len; // Build-id length in bytes (raw, not hex string length)
uintptr_t _load_bias; // Load bias (image_base - file_base address)

void **_imports[NUM_IMPORTS][NUM_IMPORT_TYPES];
bool _imports_patchable;
bool _debug_symbols;
@@ -169,6 +174,19 @@ class CodeCache {

void setDebugSymbols(bool debug_symbols) { _debug_symbols = debug_symbols; }

// Build-ID and remote symbolication support
const char* buildId() const { return _build_id; }
size_t buildIdLen() const { return _build_id_len; }
bool hasBuildId() const { return _build_id != nullptr; }
uintptr_t loadBias() const { return _load_bias; }
short libIndex() const { return _lib_index; }

// Sets the build-id (hex string) and stores the original byte length
// build_id: null-terminated hex string (e.g., "abc123..." for 40-char string)
// build_id_len: original byte length before hex conversion (e.g., 20 bytes)
void setBuildId(const char* build_id, size_t build_id_len);
void setLoadBias(uintptr_t load_bias) { _load_bias = load_bias; }

void add(const void *start, int length, const char *name,
bool update_bounds = false);
void updateBounds(const void *start, const void *end);
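
The comments on `setBuildId` distinguish the raw byte length (e.g. 20) from the hex string length (e.g. 40). The relationship can be sketched as below; this is illustrative only, since the PR's actual extraction code lives in `elfBuildId.cpp`, which is not shown in this excerpt, and `buildIdToHex` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Converts raw build-id bytes to a lowercase hex string.
// Each input byte becomes exactly two output characters.
inline std::string buildIdToHex(const uint8_t *bytes, std::size_t len) {
  static const char kDigits[] = "0123456789abcdef";
  std::string hex;
  hex.reserve(len * 2);
  for (std::size_t i = 0; i < len; i++) {
    hex.push_back(kDigits[bytes[i] >> 4]);    // high nibble
    hex.push_back(kDigits[bytes[i] & 0x0f]);  // low nibble
  }
  return hex;
}
```

With this convention, `build_id_len` passed to `setBuildId` would be `len`, while `strlen(build_id)` would be `2 * len`.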
@@ -225,7 +243,7 @@ class CodeCacheArray {
memset(_libs, 0, MAX_NATIVE_LIBS * sizeof(CodeCache *));
}

CodeCache *operator[](int index) { return _libs[index]; }
CodeCache *operator[](int index) const { return __atomic_load_n(&_libs[index], __ATOMIC_ACQUIRE); }

int count() const { return __atomic_load_n(&_count, __ATOMIC_RELAXED); }

@@ -247,7 +265,7 @@ class CodeCacheArray {
return lib;
}

size_t memoryUsage() {
size_t memoryUsage() const {
return __atomic_load_n(&_used_memory, __ATOMIC_RELAXED);
}
};
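
The switch of `operator[]` to an acquire load only helps if the writer publishes each slot with a matching release store. The sketch below shows that pairing on a simplified stand-in; it assumes a single writer and the GCC/Clang `__atomic` builtins already used in this file. `PublishedArray` and `Item` are hypothetical, and the real `CodeCacheArray::add` is not shown in this excerpt.

```cpp
// Simplified stand-in for CodeCacheArray; Item replaces CodeCache.
struct Item {
  int value;
};

class PublishedArray {
  static const int kCapacity = 16;
  Item *_slots[kCapacity];
  int _count;

 public:
  PublishedArray() : _slots(), _count(0) {}

  // Writer side (single writer assumed): store the pointer with release
  // semantics so a reader that observes it also observes the fully
  // constructed Item behind it, then publish the new count.
  bool add(Item *item) {
    int index = __atomic_load_n(&_count, __ATOMIC_RELAXED);
    if (index >= kCapacity) {
      return false;
    }
    __atomic_store_n(&_slots[index], item, __ATOMIC_RELEASE);
    __atomic_store_n(&_count, index + 1, __ATOMIC_RELEASE);
    return true;
  }

  // Reader side: the acquire load pairs with the release store in add().
  Item *operator[](int index) const {
    return __atomic_load_n(&_slots[index], __ATOMIC_ACQUIRE);
  }

  int count() const { return __atomic_load_n(&_count, __ATOMIC_ACQUIRE); }
};
```

A reader that iterates up to `count()` and skips null slots stays safe even if it races with a concurrent `add`.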
37 changes: 32 additions & 5 deletions ddprof-lib/src/main/cpp/flightRecorder.cpp
@@ -114,6 +114,25 @@ void Lookup::fillNativeMethodInfo(MethodInfo *mi, const char *name,
}
}

void Lookup::fillRemoteFrameInfo(MethodInfo *mi, const RemoteFrameInfo *rfi) {
// Store build-id in the class name field
mi->_class = _classes->lookup(rfi->build_id);

// Store PC offset in hex format in the signature field
char offset_hex[32];
snprintf(offset_hex, sizeof(offset_hex), "0x%lx", rfi->pc_offset);
Comment on lines +122 to +123
Copilot AI Jan 13, 2026

The snprintf call uses format string "0x%lx" which assumes uintptr_t is equivalent to unsigned long. On some platforms (e.g., Windows x64), uintptr_t may be unsigned long long, not unsigned long, which could cause format string warnings or incorrect output. Use the portable PRIxPTR macro from inttypes.h instead.

mi->_sig = _symbols.lookup(offset_hex);

// Use same modifiers as regular native frames (0x100 = ACC_NATIVE for consistency)
mi->_modifiers = 0x100;
// Use FRAME_NATIVE_REMOTE type to indicate remote symbolication
mi->_type = FRAME_NATIVE_REMOTE;
mi->_line_number_table = nullptr;

// Method name indicates need for remote symbolication
mi->_name = _symbols.lookup("<remote>");
}
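
`fillRemoteFrameInfo` reuses the existing method fields as a convention: class name holds the build-id, signature holds the offset as `0x<hex>`, and the method name is `<remote>`. A consumer could undo that convention roughly as follows. This is a sketch of a hypothetical reader, not code from the PR or from any JFR parser; `RemoteRef` and `decodeRemoteFrame` are made-up names, and a real consumer would probably check the `FRAME_NATIVE_REMOTE` frame type first.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical decoded form of a remote frame.
struct RemoteRef {
  std::string build_id;
  uint64_t pc_offset;
};

// Returns true and fills `out` when the triple follows the convention:
// method name "<remote>", class name = build-id, signature = "0x<hex>".
inline bool decodeRemoteFrame(const std::string &class_name,
                              const std::string &method_name,
                              const std::string &signature, RemoteRef *out) {
  if (method_name != "<remote>") return false;
  if (signature.size() <= 2 || signature.compare(0, 2, "0x") != 0) return false;

  errno = 0;
  char *end = nullptr;
  unsigned long long value = std::strtoull(signature.c_str() + 2, &end, 16);
  if (errno != 0 || *end != '\0') return false;  // overflow or trailing junk

  out->build_id = class_name;
  out->pc_offset = value;
  return true;
}
```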

void Lookup::cutArguments(char *func) {
char *p = strrchr(func, ')');
if (p == NULL)
@@ -322,6 +341,9 @@ MethodInfo *Lookup::resolveMethod(ASGCT_CallFrame &frame) {
const char *name = (const char *)method;
fillNativeMethodInfo(mi, name,
Profiler::instance()->getLibraryName(name));
} else if (frame.bci == BCI_NATIVE_FRAME_REMOTE) {
const RemoteFrameInfo *rfi = (const RemoteFrameInfo *)method;
fillRemoteFrameInfo(mi, rfi);
} else {
fillJavaMethodInfo(mi, method, first_time);
}
@@ -1036,18 +1058,23 @@ void Recording::writeNativeLibraries(Buffer *buf) {
if (_recorded_lib_count < 0)
return;

Profiler *profiler = Profiler::instance();
CodeCacheArray &native_libs = profiler->_native_libs;
Libraries *libraries = Libraries::instance();
const CodeCacheArray &native_libs = libraries->native_libs();
int native_lib_count = native_libs.count();

for (int i = _recorded_lib_count; i < native_lib_count; i++) {
CodeCache* lib = native_libs[i];

// Emit jdk.NativeLibrary event with extended fields (buildId and loadBias)
flushIfNeeded(buf, RECORDING_BUFFER_LIMIT - MAX_STRING_LENGTH);
int start = buf->skip(5);
buf->putVar64(T_NATIVE_LIBRARY);
buf->putVar64(_start_ticks);
buf->putUtf8(native_libs[i]->name());
buf->putVar64((uintptr_t)native_libs[i]->minAddress());
buf->putVar64((uintptr_t)native_libs[i]->maxAddress());
buf->putUtf8(lib->name());
buf->putVar64((uintptr_t)lib->minAddress());
buf->putVar64((uintptr_t)lib->maxAddress());
buf->putUtf8(lib->hasBuildId() ? lib->buildId() : "");
buf->putVar64((uintptr_t)lib->loadBias());
buf->putVar32(start, buf->offset() - start);
flushIfNeeded(buf);
}
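
With `buildId` written into each `jdk.NativeLibrary` event, a consumer can map the build-id found in a remote frame back to a library name. The sketch below assumes the events have already been parsed into plain records; `NativeLibraryRecord` and `findByBuildId` are hypothetical names. It relies on one detail visible in the hunk: a library without a build-id is written with an empty string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical parsed form of one jdk.NativeLibrary event.
struct NativeLibraryRecord {
  std::string name;
  uint64_t base_address;
  uint64_t top_address;
  std::string build_id;  // empty when the library had no build-id
  uint64_t load_bias;
};

// Returns the first record with a matching build-id, or nullptr.
// An empty build-id never matches, because the writer uses "" to mean
// "not available" and several libraries may share that value.
inline const NativeLibraryRecord *findByBuildId(
    const std::vector<NativeLibraryRecord> &libs, const std::string &build_id) {
  if (build_id.empty()) {
    return nullptr;
  }
  for (const NativeLibraryRecord &lib : libs) {
    if (lib.build_id == build_id) {
      return &lib;
    }
  }
  return nullptr;
}
```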
1 change: 1 addition & 0 deletions ddprof-lib/src/main/cpp/flightRecorder.h
@@ -276,6 +276,7 @@ class Lookup {
private:
void fillNativeMethodInfo(MethodInfo *mi, const char *name,
const char *lib_name);
void fillRemoteFrameInfo(MethodInfo *mi, const RemoteFrameInfo *rfi);
void cutArguments(char *func);
void fillJavaMethodInfo(MethodInfo *mi, jmethodID method, bool first_time);
bool has_prefix(const char *str, const char *prefix) const {
1 change: 1 addition & 0 deletions ddprof-lib/src/main/cpp/frame.h
@@ -9,6 +9,7 @@ enum FrameTypeId {
FRAME_CPP = 4,
FRAME_KERNEL = 5,
FRAME_C1_COMPILED = 6,
FRAME_NATIVE_REMOTE = 7, // Native frame with remote symbolication (build-id + pc-offset)
};

class FrameType {
4 changes: 3 additions & 1 deletion ddprof-lib/src/main/cpp/jfrMetadata.cpp
@@ -323,7 +323,9 @@ void JfrMetadata::initialize(
<< field("startTime", T_LONG, "Start Time", F_TIME_TICKS)
<< field("name", T_STRING, "Name")
<< field("baseAddress", T_LONG, "Base Address", F_ADDRESS)
<< field("topAddress", T_LONG, "Top Address", F_ADDRESS))
<< field("topAddress", T_LONG, "Top Address", F_ADDRESS)
<< field("buildId", T_STRING, "GNU Build ID")
<< field("loadBias", T_LONG, "Load Bias", F_ADDRESS))

<< (type("profiler.Log", T_LOG, "Log Message")
<< category("Profiler")