src: enrich OTLP metrics resource attributes #454

Open
santigimeno wants to merge 1 commit into node-v24.x-nsolid-v6.x from santi/info_in_attrs_standard

@santigimeno santigimeno commented Apr 23, 2026

Add process metadata from nsolid.info() to OTLP metric resources while leaving logs and traces on the common resource. Metrics now include different attributes derived from process info. This will ease our internal metrics handling.

Keep resource access thread-safe by returning shared resource snapshots guarded by nsuv::ns_mutex. This avoids use-after-free if UpdateResource() replaces a resource while another thread is exporting telemetry.

Update gRPC and OTLP metrics tests to validate the expanded resource attributes.
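
The snapshot idea in the second paragraph can be illustrated with a minimal sketch. It is not the code in this PR: `Resource` and `ResourceHolder` are hypothetical stand-ins, and `std::mutex` is used in place of `nsuv::ns_mutex` so the example compiles on its own.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the OpenTelemetry Resource type.
struct Resource {
  std::map<std::string, std::string> attributes;
};

// Hypothetical holder; the PR keeps this state in globals guarded by
// nsuv::ns_mutex, std::mutex is an assumption made for this sketch.
class ResourceHolder {
 public:
  // Returns a snapshot. The caller's shared_ptr keeps the object alive even
  // if Update() publishes a replacement afterwards.
  std::shared_ptr<const Resource> Get() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return resource_;
  }

  // Publishes a new resource and returns it. Snapshots handed out earlier
  // remain valid until their last owner releases them.
  std::shared_ptr<const Resource> Update(Resource next) {
    std::shared_ptr<const Resource> fresh =
        std::make_shared<Resource>(std::move(next));
    std::lock_guard<std::mutex> lock(mutex_);
    resource_ = fresh;
    return fresh;
  }

 private:
  mutable std::mutex mutex_;
  std::shared_ptr<const Resource> resource_ = std::make_shared<Resource>();
};
```

An exporter thread would call `Get()` once, keep the returned pointer for the duration of the export, and only use the raw pointer (as in `data.resource_ = resource.get()`) while that local `shared_ptr` is still in scope.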

Summary by CodeRabbit

  • New Features

    • Metrics now include a dedicated metrics resource and richer telemetry metadata (host, OS/arch, process, runtime, CPU, service and deployment details).
    • Tests and agent messaging now exchange runtime info for richer validation.
  • Bug Fixes

    • Safer, concurrency-aware resource management and more robust resource updates; process metadata preserves service name and uses the correct process title key.
  • Tests

    • Expanded, stricter tests: typed resource attributes are validated comprehensively and unexpected keys are rejected.

@santigimeno santigimeno requested a review from RafaelGSS April 23, 2026 15:44
@santigimeno santigimeno self-assigned this Apr 23, 2026

coderabbitai Bot commented Apr 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cacf77c9-9177-44d6-afbf-6aaad5969832

📥 Commits

Reviewing files that changed from the base of the PR and between 8747a8c and 26e038d.

📒 Files selected for processing (4)
  • agents/grpc/src/grpc_agent.cc
  • agents/otlp/src/otlp_common.cc
  • agents/otlp/src/otlp_common.h
  • agents/otlp/src/otlp_metrics.cc
🚧 Files skipped from review as they are similar to previous changes (2)
  • agents/otlp/src/otlp_metrics.cc
  • agents/grpc/src/grpc_agent.cc

Walkthrough

Switches OTLP resource ownership to thread-safe shared pointers, adds a metrics-specific Resource via GetMetricsResource(), updates metric export paths to attach that resource, and extends test harness/helpers to fetch and validate expanded process/host/runtime metadata.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core OTLP Resource Management: agents/otlp/src/otlp_common.h, agents/otlp/src/otlp_common.cc | Move resource state to std::shared_ptr with mutex protection; change GetResource()/UpdateResource(...) to return std::shared_ptr<Resource>; add GetMetricsResource() which builds/merges metrics-specific attributes (normalized host/os/arch, process start ISO8601, runtime info); update callers to hold shared_ptr locally. |
| gRPC Agent Integration: agents/grpc/src/grpc_agent.cc | Replace GetResource() usage in metrics export paths with GetMetricsResource(), capture shared_ptr locally and set data.resource_ = resource.get() before populating/exporting metrics. |
| Metrics Export: agents/otlp/src/otlp_metrics.cc | Use GetMetricsResource() for ResourceMetrics payloads in process/thread metrics paths and assign .get() to data.resource_ prior to scope/metric population and export. |
| Test gRPC Client & Helpers: test/common/nsolid-grpc-agent/client.js, test/common/nsolid-grpc-agent/index.js | Add support for 'info' message and TestClient.info() helper; update checkResource() signature to accept info and expand/normalize expected resource attributes (host/os/arch, runtime, cpu, process start, tags) and tighten key/type checks. |
| Resource Validation Tests: test/agents/test-grpc-metrics.mjs, test/agents/test-otlp-grpc-metrics.mjs, test/agents/test-otlp-metrics.mjs | Child now sends nsolid.info(); parent tests fetch info and validate an expanded, typed OTLP resource-attribute map, rejecting unknown keys and asserting correct OTLP value fields (stringValue/intValue). |

Sequence Diagram(s)

sequenceDiagram
    participant Agent as gRPC Agent
    participant OTLP as otlp_common
    participant Exporter as OTLP Exporter
    participant Test as Test Harness

    Agent->>OTLP: GetMetricsResource()
    OTLP-->>Agent: std::shared_ptr<Resource> (metrics_resource)
    Agent->>Exporter: Populate ResourceMetrics (data.resource_ = resource.get())
    Exporter->>Exporter: serialize & send metrics payload (with Resource)
    Test->>Agent: request/observe metrics event
    Agent-->>Test: metrics event (validated against metrics_resource)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • RafaelGSS

Poem

🐰 I hopped through pointers, shared and bright,
Swapped raw for safe, and set metrics right.
Hosts and runtimes now wear tidy tags,
Tests fetched info and chased down nags.
The rabbit dances — exports gleam tonight! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 18.42% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'src: enrich OTLP metrics resource attributes' directly summarizes the main change: enriching OTLP metrics with additional resource attributes derived from process metadata. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Review rate limit: 4/5 reviews remaining, refill in 12 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/otlp/src/otlp_common.cc`:
- Around line 89-92: The static nsuv::ns_mutex resource_mutex_g is
default-constructed and never initialized, so lock attempts in GetResource(),
GetMetricsResource(), and UpdateResource() can operate on an uninitialized
mutex; fix by initializing resource_mutex_g before first use—either call
resource_mutex_g.init() from a module/init function that runs before any of
those functions, or replace the declaration with the parameterized/explicit
nsuv::ns_mutex constructor that performs initialization (ensuring the mutex is
valid for all scoped_lock usages).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3be3e097-4308-44cd-b971-4805d219e7c8

📥 Commits

Reviewing files that changed from the base of the PR and between 1fec452 and a994743.

📒 Files selected for processing (9)
  • agents/grpc/src/grpc_agent.cc
  • agents/otlp/src/otlp_common.cc
  • agents/otlp/src/otlp_common.h
  • agents/otlp/src/otlp_metrics.cc
  • test/agents/test-grpc-metrics.mjs
  • test/agents/test-otlp-grpc-metrics.mjs
  • test/agents/test-otlp-metrics.mjs
  • test/common/nsolid-grpc-agent/client.js
  • test/common/nsolid-grpc-agent/index.js

Comment thread agents/otlp/src/otlp_common.cc
@santigimeno santigimeno force-pushed the santi/info_in_attrs_standard branch from a994743 to 0284332 Compare April 23, 2026 19:03

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/otlp/src/otlp_common.cc`:
- Around line 333-343: EnsureMetricsResourceInitializedLocked currently builds a
new Resource from GetMetadataResourceAttributes(info) which can inject
"unknown_service" and diverge from the already-configured resource_g; modify
EnsureMetricsResourceInitializedLocked so after computing attrs =
GetMetadataResourceAttributes(info) you detect if attrs lacks a service.name (or
is "unknown_service") and if resource_g has a service.name, copy that value into
attrs before calling Resource::Create; then construct new_res and merge as
before (references: EnsureMetricsResourceInitializedLocked,
GetMetadataResourceAttributes, Resource::Create, resource_g,
metrics_resource_g).

In `@test/common/nsolid-grpc-agent/index.js`:
- Around line 505-518: The info() helper currently uses
this.#child.once('message') which can consume unrelated IPC messages and hang;
replace its implementation to mirror
config()/enableAssets()/disableAssets()/enableTraces()/disableTraces() by
calling the existing `#sendAndWait` helper: if this.#child exists, invoke
this.#sendAndWait({ type: 'info' }, 'info') and return the resolved info payload
(the helper enforces the 30s timeout and message-type filtering); if no child,
return null. Ensure you reference the info() method and reuse `#sendAndWait` so
the listener filters by message.type and avoids hanging.
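
The first finding above (keeping an already-configured service.name instead of the "unknown_service" fallback) can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `Attributes` and `PreserveServiceName` are hypothetical names, and a plain `std::map` stands in for the SDK's attribute container.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a resource attribute map.
using Attributes = std::map<std::string, std::string>;

const char kServiceName[] = "service.name";
const char kUnknownService[] = "unknown_service";

// Hypothetical helper: if the freshly built metadata attributes lack a
// service.name, or carry the "unknown_service" fallback, copy the value
// from the already-configured attributes (when they have one).
inline void PreserveServiceName(const Attributes& configured,
                                Attributes* attrs) {
  const auto existing = configured.find(kServiceName);
  if (existing == configured.end()) {
    return;  // Nothing configured, leave attrs untouched.
  }
  const auto candidate = attrs->find(kServiceName);
  if (candidate == attrs->end() || candidate->second == kUnknownService) {
    (*attrs)[kServiceName] = existing->second;
  }
}
```

In this sketch an explicit service.name in the new attributes is kept as-is; whether that is the desired precedence in the real merge is a design decision for the PR author.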

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 922a703a-85bb-4457-b666-3d65263809df

📥 Commits

Reviewing files that changed from the base of the PR and between a994743 and 0284332.

📒 Files selected for processing (9)
  • agents/grpc/src/grpc_agent.cc
  • agents/otlp/src/otlp_common.cc
  • agents/otlp/src/otlp_common.h
  • agents/otlp/src/otlp_metrics.cc
  • test/agents/test-grpc-metrics.mjs
  • test/agents/test-otlp-grpc-metrics.mjs
  • test/agents/test-otlp-metrics.mjs
  • test/common/nsolid-grpc-agent/client.js
  • test/common/nsolid-grpc-agent/index.js
✅ Files skipped from review due to trivial changes (1)
  • test/common/nsolid-grpc-agent/client.js
🚧 Files skipped from review as they are similar to previous changes (4)
  • agents/otlp/src/otlp_metrics.cc
  • agents/grpc/src/grpc_agent.cc
  • test/agents/test-otlp-grpc-metrics.mjs
  • test/agents/test-otlp-metrics.mjs

Comment thread agents/otlp/src/otlp_common.cc Outdated
Comment thread test/common/nsolid-grpc-agent/index.js
coderabbitai Bot previously approved these changes Apr 23, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/common/nsolid-grpc-agent/index.js (1)

250-264: once('message') may hang if an unrelated message arrives first.

This method uses once('message') with a type check inside the callback. If a different message type arrives before the expected 'info' response, the listener fires once, the type check fails (so resolve isn't called), but the listener is already removed—causing the promise to hang forever.

Other methods in this class (e.g., cpuProfile, heapProfile) correctly use on('message') with a named handler that's explicitly removed when the expected message arrives. Consider aligning this method with that pattern.

♻️ Suggested fix
   async info(agentId) {
     return new Promise((resolve) => {
       if (this.#server) {
         const requestId = randomUUID();
         this.#server.send({ type: 'info', agentId, requestId });
-        this.#server.once('message', (msg) => {
+        const msgListener = (msg) => {
           if (msg.type === 'info') {
+            this.#server.off('message', msgListener);
             resolve({ requestId, data: msg.data });
           }
-        });
+        };
+        this.#server.on('message', msgListener);
       } else {
         resolve(null);
       }
     });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/common/nsolid-grpc-agent/index.js` around lines 250 - 264, The
info(agentId) method currently uses this.#server.once('message') which can
consume an unrelated message and never resolve; change it to register a named
handler with this.#server.on('message'), check for msg.type === 'info' and
msg.requestId matching the sent requestId, then call resolve({ requestId, data:
msg.data }) and remove that handler via this.#server.off('message', handler);
also ensure the handler only runs when this.#server exists and fall back to
resolving null like the other methods (mirror the pattern used in
cpuProfile/heapProfile).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/otlp/src/otlp_common.cc`:
- Around line 155-158: The test expects process.pid as a string, but
attrs.SetAttribute(kProcessPid, static_cast<int64_t>(...)) stores it as an
integer; change the code handling info.find("pid") to convert the pid to a
string and call attrs.SetAttribute(kProcessPid, std::to_string(...)) (mirror how
cpuCores is handled) so kProcessPid is populated with the stringified pid
instead of an int.

---

Nitpick comments:
In `@test/common/nsolid-grpc-agent/index.js`:
- Around line 250-264: The info(agentId) method currently uses
this.#server.once('message') which can consume an unrelated message and never
resolve; change it to register a named handler with this.#server.on('message'),
check for msg.type === 'info' and msg.requestId matching the sent requestId,
then call resolve({ requestId, data: msg.data }) and remove that handler via
this.#server.off('message', handler); also ensure the handler only runs when
this.#server exists and fall back to resolving null like the other methods
(mirror the pattern used in cpuProfile/heapProfile).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10ad6dfa-6227-40d6-8918-4e3c796a9b4f

📥 Commits

Reviewing files that changed from the base of the PR and between 4ea198a and 8747a8c.

📒 Files selected for processing (9)
  • agents/grpc/src/grpc_agent.cc
  • agents/otlp/src/otlp_common.cc
  • agents/otlp/src/otlp_common.h
  • agents/otlp/src/otlp_metrics.cc
  • test/agents/test-grpc-metrics.mjs
  • test/agents/test-otlp-grpc-metrics.mjs
  • test/agents/test-otlp-metrics.mjs
  • test/common/nsolid-grpc-agent/client.js
  • test/common/nsolid-grpc-agent/index.js
✅ Files skipped from review due to trivial changes (1)
  • agents/grpc/src/grpc_agent.cc
🚧 Files skipped from review as they are similar to previous changes (4)
  • agents/otlp/src/otlp_metrics.cc
  • test/agents/test-otlp-grpc-metrics.mjs
  • test/agents/test-otlp-metrics.mjs
  • test/agents/test-grpc-metrics.mjs

Comment on lines +155 to +158
  it = info.find("pid");
  if (it != info.end() && it->is_number_unsigned()) {
    attrs.SetAttribute(kProcessPid, static_cast<int64_t>(it->get<uint32_t>()));
  }

⚠️ Potential issue | 🟠 Major

Type mismatch: process.pid is set as int64_t but tests expect a string.

The attribute is set as an integer here, but in test/common/nsolid-grpc-agent/index.js (line 75) the expected value is the template string `${info.pid}`. The comparison at lines 98-99 uses assert.strictEqual(actual, expected), which fails because JavaScript's strict equality distinguishes the number 12345 from the string "12345".

Either convert the pid to string here (consistent with cpuCores on line 205), or update the test expectation to use the integer value.

🛠️ Option A: Convert to string in C++ (consistent with cpuCores)
   it = info.find("pid");
   if (it != info.end() && it->is_number_unsigned()) {
-    attrs.SetAttribute(kProcessPid, static_cast<int64_t>(it->get<uint32_t>()));
+    attrs.SetAttribute(kProcessPid, std::to_string(it->get<uint32_t>()));
   }
🛠️ Option B: Update test to expect integer
-    expectedAttributes['process.pid'] = `${info.pid}`;
+    expectedAttributes['process.pid'] = info.pid;
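
To make the mismatch concrete, here is a small sketch of how a typed attribute value behaves. `AttributeValue`, `PidAsInt`, and `PidAsString` are hypothetical names for illustration; a `std::variant` stands in for the OTLP value that is serialized as either intValue or stringValue. It assumes a C++17 compiler.

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for a typed OTLP attribute value
// (index 0 ~ intValue, index 1 ~ stringValue).
using AttributeValue = std::variant<int64_t, std::string>;

// Mirrors the current code path: the pid is stored as a 64-bit integer.
inline AttributeValue PidAsInt(uint32_t pid) {
  return static_cast<int64_t>(pid);
}

// Mirrors Option A: the pid is stored as its decimal string.
inline AttributeValue PidAsString(uint32_t pid) {
  return std::to_string(pid);
}
```

Two values built from the same pid through these two helpers hold different alternatives, so they never compare equal. That is the same situation the strict comparison in the test runs into, and it is why the C++ side and the test expectation need to agree on one type.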
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/otlp/src/otlp_common.cc` around lines 155 - 158, The test expects
process.pid as a string, but attrs.SetAttribute(kProcessPid,
static_cast<int64_t>(...)) stores it as an integer; change the code handling
info.find("pid") to convert the pid to a string and call
attrs.SetAttribute(kProcessPid, std::to_string(...)) (mirror how cpuCores is
handled) so kProcessPid is populated with the stringified pid instead of an int.

Add process metadata from nsolid.info() to OTLP metric resources while
leaving logs and traces on the common resource. Metrics now include
different attributes derived from process `info`. This will ease our
internal metrics handling.

Keep resource access thread-safe by returning shared resource snapshots
guarded by nsuv::ns_mutex. This avoids use-after-free if UpdateResource()
replaces a resource while another thread is exporting telemetry.

Update gRPC and OTLP metrics tests to validate the expanded resource
attributes.
@santigimeno santigimeno force-pushed the santi/info_in_attrs_standard branch from 8747a8c to 26e038d Compare April 29, 2026 18:26