Skip to content

feat: display llm usage data#784

Merged
edwinjosechittilappilly merged 7 commits into
mainfrom
usage-data
Feb 24, 2026
Merged

feat: display llm usage data#784
edwinjosechittilappilly merged 7 commits into
mainfrom
usage-data

Conversation

@phact
Copy link
Copy Markdown
Collaborator

@phact phact commented Jan 14, 2026

Depends on an updated langflow responses endpoint:

langflow-ai/langflow#11302

@edwinjosechittilappilly
Copy link
Copy Markdown
Collaborator

LF PR approved awaiting merge!

@aimurphy aimurphy mentioned this pull request Feb 6, 2026
1 task
Copy link
Copy Markdown
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,

I believe we can merge this PR and the funtionality will ne enabled once the usuage Data is available in Responses API in LF after LF upgrade.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds end-to-end support for exposing and displaying LLM token usage (from response.completed / Responses API usage payloads) across streaming, persisted chat history, and UI rendering.

Changes:

  • Backend: capture usage from streamed response.completed events, persist it on assistant messages, and expose it via the v1 chat GET endpoint.
  • Frontend: introduce a TokenUsage type, capture usage from streaming events, and render token usage in assistant messages.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/api/v1/chat.py Extends v1 conversation response mapping to include per-message token usage when available.
src/agent.py Captures usage from response.completed during streaming; persists usage into message response_data; adds logging.
frontend/hooks/useChatStreaming.ts Captures response.completed usage during streaming and attaches it to the final assistant message.
frontend/app/chat/page.tsx Extracts usage from historical response_data, passes usage into AssistantMessage, and sets usage on non-streaming results.
frontend/app/chat/_types/types.ts Adds TokenUsage and Message.usage typing.
frontend/app/chat/_components/token-usage.tsx New UI component to display token usage.
frontend/app/chat/_components/assistant-message.tsx Renders TokenUsage for completed (non-streaming) assistant messages.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +507 to +515
if (msg.response_data && typeof msg.response_data === "object") {
const responseData =
typeof msg.response_data === "string"
? JSON.parse(msg.response_data)
: msg.response_data;
if (responseData.usage) {
message.usage = responseData.usage;
}
}
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current guard if (msg.response_data && typeof msg.response_data === "object") makes the subsequent typeof msg.response_data === "string" ? JSON.parse(...) branch unreachable, so usage will never be extracted when response_data is actually a string. Also, JSON.parse here can throw and break conversation loading if response_data is non-JSON. Consider widening the guard to accept string | object and wrapping parsing in a try/catch (or a small safe-parse helper) before reading .usage.

Copilot uses AI. Check for mistakes.
Comment thread src/api/v1/chat.py
})
}
# Include token usage if available (from Responses API)
usage = msg.get("response_data", {}).get("usage") if isinstance(msg.get("response_data"), dict) else None
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

response_data from Langflow/history may be serialized as a JSON string (the frontend already treats it as possibly-string). This code only extracts usage when response_data is a dict, so usage will be silently omitted for string payloads. Consider normalizing response_data once (e.g., parse JSON strings when possible) and then reading usage from the normalized object; also avoid calling msg.get("response_data") multiple times in the same expression for clarity.

Suggested change
usage = msg.get("response_data", {}).get("usage") if isinstance(msg.get("response_data"), dict) else None
response_data = msg.get("response_data")
if isinstance(response_data, str):
try:
response_data = json.loads(response_data)
except Exception:
# If parsing fails, leave response_data as-is (usage will be omitted)
response_data = None
usage = response_data.get("usage") if isinstance(response_data, dict) else None

Copilot uses AI. Check for mistakes.
Comment thread src/agent.py
Comment on lines +200 to +210
# Detect response.completed event and log usage
if isinstance(chunk_data, dict) and chunk_data.get("type") == "response.completed":
response_data = chunk_data.get("response", {})
usage = response_data.get("usage")
if usage:
logger.info(
"Stream usage data",
input_tokens=usage.get("input_tokens"),
output_tokens=usage.get("output_tokens"),
total_tokens=usage.get("total_tokens"),
)
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging token usage at info level for every streamed response.completed event can generate a lot of log volume in production and may be inconsistent with nearby per-chunk logging (which is debug). Consider lowering this to debug (or gating behind a feature flag / sampling) to reduce operational noise while still allowing investigation when needed.

Copilot uses AI. Check for mistakes.
Comment thread src/agent.py
Comment on lines +522 to 527
# Capture usage from response.completed event
if chunk_data.get("type") == "response.completed":
response_obj = chunk_data.get("response", {})
usage_data = response_obj.get("usage")
except:
pass
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bare except: pass here will also swallow asyncio.CancelledError and any unexpected decoding/parsing errors, which can break cooperative cancellation and make stream issues extremely hard to diagnose. Prefer except Exception as e with at least a debug log, and let CancelledError propagate.

Copilot uses AI. Check for mistakes.
Comment thread src/agent.py
Comment on lines 728 to 729
except:
pass
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as the other stream: except: pass will swallow asyncio.CancelledError and hide JSON decoding problems, which can lead to stuck/cancel-ignoring requests and makes debugging difficult. Prefer catching Exception (and logging) while allowing cancellation to propagate.

Suggested change
except:
pass
except Exception as e:
logger.warning(f"Failed to parse langflow chunk: {e}")

Copilot uses AI. Check for mistakes.
@edwinjosechittilappilly edwinjosechittilappilly merged commit 4b50408 into main Feb 24, 2026
4 checks passed
@edwinjosechittilappilly edwinjosechittilappilly deleted the usage-data branch February 24, 2026 16:18
@edwinjosechittilappilly edwinjosechittilappilly added the enhancement 🔵 New feature or request label Feb 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement 🔵 New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants