feat: Add opt-in metrics collection for chat requests #102
nazq wants to merge 1 commit into graniet:main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
src/chat/tracked.rs

```rust
// Record first chunk time
if this.first_chunk_time.is_none() {
    *this.first_chunk_time = Some(Instant::now());
}
```
Record TTFT only when a text token is seen
Here `first_chunk_time` is set on the very first streamed item, even if that item is a tool-use event or an empty/usage-only chunk. For tool-only responses, or streams that send `ToolUseStart`/`Done` before any text, `time_to_first_token` becomes `Some(...)` even though no token was emitted, which makes the metric inaccurate. Consider setting `first_chunk_time` only when `extract_text()` returns `Some` (and ideally non-empty), so TTFT reflects the first actual text token.
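A minimal, self-contained sketch of the suggested fix. `Chunk` and `extract_text()` here are illustrative stand-ins for the PR's actual stream item type; the point is that TTFT is recorded only when a chunk carries non-empty text.

```rust
use std::time::Instant;

// Illustrative stand-in for the PR's stream item type.
enum Chunk {
    Text(String),
    ToolUseStart,
}

impl Chunk {
    // Assumed helper mirroring the `extract_text()` mentioned in the review.
    fn extract_text(&self) -> Option<&str> {
        match self {
            Chunk::Text(t) => Some(t),
            _ => None,
        }
    }
}

// Record TTFT only when the chunk yields actual (non-empty) text.
fn record_ttft(first_chunk_time: &mut Option<Instant>, chunk: &Chunk) {
    if first_chunk_time.is_none() {
        if let Some(text) = chunk.extract_text() {
            if !text.is_empty() {
                *first_chunk_time = Some(Instant::now());
            }
        }
    }
}

fn main() {
    let mut ttft = None;
    record_ttft(&mut ttft, &Chunk::ToolUseStart); // tool events don't count
    assert!(ttft.is_none());
    record_ttft(&mut ttft, &Chunk::Text("hi".into()));
    assert!(ttft.is_some());
}
```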
Summary
- `ChatMetrics` struct for tracking token usage and timing
- `Tracked<S>` stream wrapper for metrics-aware streaming
- `MetricsProvider` trait with `chat_with_metrics()` and `chat_stream_with_metrics()`
- `.enable_metrics(bool)` builder method
- `StreamChunk::Usage` variant for token tracking

Motivation
I realized I was building most of these features into multiple app engines, so I decided to push them down into the library and share them. Users need visibility into LLM request performance and costs, and this feature provides it.
Approach
- Opt-in design: Metrics collection is disabled by default. Users must explicitly enable it via `.enable_metrics(true)` on the builder. This ensures zero overhead for users who don't need metrics.
- Non-breaking: All new APIs are additive. Existing `chat()` and `chat_stream()` methods work unchanged.

Usage
Enable Metrics
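A hedged sketch of the opt-in builder flow described above. `ClientBuilder` and `Client` are illustrative names, not the crate's actual types; only the `.enable_metrics(bool)` method comes from this PR's description.

```rust
// Minimal builder with an `enable_metrics` flag, disabled by default.
#[derive(Default)]
struct ClientBuilder {
    enable_metrics: bool,
}

struct Client {
    metrics_enabled: bool,
}

impl ClientBuilder {
    fn new() -> Self {
        Self::default()
    }

    // Opt in to metrics collection; off by default for zero overhead.
    fn enable_metrics(mut self, enabled: bool) -> Self {
        self.enable_metrics = enabled;
        self
    }

    fn build(self) -> Client {
        Client { metrics_enabled: self.enable_metrics }
    }
}

fn main() {
    let client = ClientBuilder::new().enable_metrics(true).build();
    assert!(client.metrics_enabled);
}
```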
Non-Streaming
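A self-contained sketch of what the non-streaming path might look like: `chat_with_metrics()` returns the response paired with a `ChatMetrics` value. The field names and the stand-in body are illustrative; a real implementation would issue the provider request and read token counts from its usage data.

```rust
use std::time::{Duration, Instant};

// Illustrative shape for the PR's `ChatMetrics`; actual fields may differ.
#[derive(Debug, Default)]
struct ChatMetrics {
    input_tokens: u32,
    output_tokens: u32,
    total_duration: Duration,
}

// Stand-in for a provider call: echoes the prompt and counts whitespace-split
// "tokens" so the example runs without a network dependency.
fn chat_with_metrics(prompt: &str) -> (String, ChatMetrics) {
    let start = Instant::now();
    let response = format!("echo: {}", prompt); // placeholder for the LLM call
    let metrics = ChatMetrics {
        input_tokens: prompt.split_whitespace().count() as u32,
        output_tokens: response.split_whitespace().count() as u32,
        total_duration: start.elapsed(),
    };
    (response, metrics)
}

fn main() {
    let (response, metrics) = chat_with_metrics("hello world");
    println!("{}", response);
    println!(
        "in={} out={} took={:?}",
        metrics.input_tokens, metrics.output_tokens, metrics.total_duration
    );
}
```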
Streaming
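A hedged sketch of the `Tracked<S>` idea for streaming. The PR wraps an async stream; here a plain `Iterator` stands in so the example has no external dependencies. It records time-to-first-token only for non-empty text chunks and captures token counts from the `StreamChunk::Usage` variant; all names besides `Tracked`, `StreamChunk::Usage`, and `first_chunk_time` are illustrative.

```rust
use std::time::{Duration, Instant};

// Simplified chunk type; the real enum has more variants.
enum StreamChunk {
    Text(String),
    Usage { output_tokens: u32 }, // the variant added by PR #96
}

// Wraps a stream and records metrics as chunks pass through.
struct Tracked<S> {
    inner: S,
    start: Instant,
    first_chunk_time: Option<Instant>,
    output_tokens: u32,
}

impl<S: Iterator<Item = StreamChunk>> Tracked<S> {
    fn new(inner: S) -> Self {
        Self { inner, start: Instant::now(), first_chunk_time: None, output_tokens: 0 }
    }

    fn time_to_first_token(&self) -> Option<Duration> {
        self.first_chunk_time.map(|t| t - self.start)
    }
}

impl<S: Iterator<Item = StreamChunk>> Iterator for Tracked<S> {
    type Item = StreamChunk;

    fn next(&mut self) -> Option<StreamChunk> {
        let chunk = self.inner.next()?;
        match &chunk {
            // Only non-empty text counts toward TTFT.
            StreamChunk::Text(text) if !text.is_empty() => {
                if self.first_chunk_time.is_none() {
                    self.first_chunk_time = Some(Instant::now());
                }
            }
            StreamChunk::Usage { output_tokens } => self.output_tokens = *output_tokens,
            _ => {}
        }
        Some(chunk)
    }
}

fn main() {
    let chunks = vec![
        StreamChunk::Text("Hello".into()),
        StreamChunk::Usage { output_tokens: 1 },
    ];
    let mut tracked = Tracked::new(chunks.into_iter());
    while let Some(_chunk) = tracked.next() {} // drain the stream
    assert!(tracked.time_to_first_token().is_some());
    assert_eq!(tracked.output_tokens, 1);
}
```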
Changes
New Files
- `src/metrics.rs`: `ChatMetrics` struct and `MetricsProvider` trait
- `src/chat/tracked.rs`: `Tracked<S>` stream wrapper

Modified Files
- `src/lib.rs`: `metrics` module
- `src/chat/mod.rs`: `Tracked`
- `src/builder.rs`: `enable_metrics` field and method

Test Plan
- `cargo check` passes
- `cargo clippy` passes
- `cargo test` passes
- `cargo build --release` passes

Dependencies
This feature integrates with PR #96 (`stream_usage`), which adds the `StreamChunk::Usage` variant. The metrics collection leverages it to capture token counts during streaming.