Proposal: Lightweight observability and Prometheus-compatible metrics service #233

## Problem
NemoClaw currently offers limited visibility into the performance and health of its operations. Tracking blueprint execution latency, API validation success rates, and sandbox lifecycle operations (such as `launch`) requires manual log parsing or external wrappers. Without built-in telemetry, it is difficult to monitor NemoClaw in automated CI/CD pipelines or production-like environments, where performance regressions and intermittent API failures need to be surfaced programmatically.

## Proposed Solution
Introduce an optional, lightweight metrics service built directly into the plugin. When enabled via the `NEMOCLAW_METRICS_ENABLED` environment variable, NemoClaw will:
- Maintain an internal registry of counters and histograms for key operations.
- Start a minimal HTTP server (defaulting to port 9090) to export these metrics in Prometheus text format at a `/metrics` endpoint.
- Instrument critical paths, including `execBlueprint` (renamed to `blueprint_execution` for clarity) and API key validation.
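To make the three points above concrete, here is a minimal sketch of what such a registry and endpoint could look like. It is not the code from the PR: the class name `MetricsRegistry`, the function `startMetricsServer`, the default bucket boundaries, and the metric names used in the usage note are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical sketch; none of these names are taken from the actual implementation.
const http = require("node:http");

class MetricsRegistry {
  constructor() {
    this.counters = new Map();   // metric name -> numeric value
    this.histograms = new Map(); // metric name -> { buckets, counts, sum, count }
  }

  // Increment a counter, creating it on first use.
  inc(name, value = 1) {
    this.counters.set(name, (this.counters.get(name) ?? 0) + value);
  }

  // Record one observation (in seconds) into a histogram.
  // The default bucket boundaries are an assumption for this sketch.
  observe(name, seconds, buckets = [0.1, 0.5, 1, 5]) {
    let h = this.histograms.get(name);
    if (!h) {
      h = { buckets, counts: buckets.map(() => 0), sum: 0, count: 0 };
      this.histograms.set(name, h);
    }
    // Prometheus buckets are cumulative: an observation counts toward
    // every bucket whose upper bound is >= the observed value.
    h.buckets.forEach((le, i) => {
      if (seconds <= le) h.counts[i] += 1;
    });
    h.sum += seconds;
    h.count += 1;
  }

  // Serialize all metrics in the Prometheus text exposition format.
  render() {
    const lines = [];
    for (const [name, value] of this.counters) {
      lines.push(`# TYPE ${name} counter`, `${name} ${value}`);
    }
    for (const [name, h] of this.histograms) {
      lines.push(`# TYPE ${name} histogram`);
      h.buckets.forEach((le, i) => {
        lines.push(`${name}_bucket{le="${le}"} ${h.counts[i]}`);
      });
      lines.push(`${name}_bucket{le="+Inf"} ${h.count}`);
      lines.push(`${name}_sum ${h.sum}`, `${name}_count ${h.count}`);
    }
    return lines.join("\n") + "\n";
  }
}

// Serve the registry at /metrics; every other path returns 404.
function startMetricsServer(registry, port = 9090) {
  const server = http.createServer((req, res) => {
    if (req.method === "GET" && req.url === "/metrics") {
      res.writeHead(200, { "Content-Type": "text/plain; version=0.0.4; charset=utf-8" });
      res.end(registry.render());
    } else {
      res.writeHead(404);
      res.end();
    }
  });
  return server.listen(port);
}
```

Under these assumptions, instrumented code would call something like `registry.inc("nemoclaw_api_key_validation_total")` or `registry.observe("nemoclaw_blueprint_execution_seconds", 0.42)`, and `startMetricsServer(registry)` would run only when `NEMOCLAW_METRICS_ENABLED` is set. The metric names here follow the common Prometheus convention of `_total` and `_seconds` suffixes and are placeholders.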

## Design Goals
- **Zero Overhead**: Metrics are disabled by default. When disabled, the metrics logic is bypassed entirely, so standard CLI users see no performance impact.
- **Zero Dependencies**: The implementation uses native `node:http` and `process.hrtime` to maintain a minimal footprint without adding to the dependency tree.
- **Prometheus Compatibility**: Adheres to standard exposition formats for immediate integration with existing monitoring stacks.
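One way the Zero Overhead and Zero Dependencies goals could fit together is a timing wrapper that is decided once at startup. The sketch below is an assumption about the shape of such a helper, not the PR's code: `isMetricsEnabled`, `makeTimed`, and the `recordDuration` callback are hypothetical names, and treating only the string `"true"` as enabled is a guess, since the proposal does not specify accepted values.

```javascript
// Hypothetical helpers; names and the "true" check are assumptions for illustration.
function isMetricsEnabled(env = process.env) {
  return env.NEMOCLAW_METRICS_ENABLED === "true";
}

// Returns a `timed(name, fn)` wrapper factory.
// recordDuration(name, seconds) is called once per completed call when enabled.
function makeTimed(enabled, recordDuration) {
  if (!enabled) {
    // Disabled: hand back the original function untouched,
    // so no timing code runs on the instrumented path.
    return (_name, fn) => fn;
  }
  return (name, fn) => async (...args) => {
    const start = process.hrtime.bigint(); // monotonic, nanosecond resolution
    try {
      return await fn(...args);
    } finally {
      // Record the duration whether the call succeeded or threw.
      const seconds = Number(process.hrtime.bigint() - start) / 1e9;
      recordDuration(name, seconds);
    }
  };
}
```

With a helper of this shape, a call site might look like `const run = timed("blueprint_execution", execBlueprint)`, where `timed = makeTimed(isMetricsEnabled(), record)`. When metrics are off, `run` is the same function object as `execBlueprint`, so the only cost is the single environment check at startup.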

## Open Questions
- Does this built-in approach align with the project's long-term roadmap, or is there a preference for moving toward OpenTelemetry despite the additional dependency weight?
- Are there specific sandbox lifecycle events (e.g., `eject`, `migrate`) that should be prioritized for initial instrumentation?

I have opened **PR #230** with a working implementation of this proposal. I look forward to your feedback on the architectural direction.
