feat: display llm usage data by phact · Pull Request #784 · langflow-ai/openrag

phact · 2026-01-14T18:15:42Z

Depends on an updated langflow responses endpoint:

edwinjosechittilappilly · 2026-01-16T14:10:54Z

LF PR approved awaiting merge!

edwinjosechittilappilly

LGTM,

I believe we can merge this PR; the functionality will be enabled once the usage data is available in the Responses API in LF after the LF upgrade.

Copilot

Pull request overview

Adds end-to-end support for exposing and displaying LLM token usage (from response.completed / Responses API usage payloads) across streaming, persisted chat history, and UI rendering.

Changes:

Backend: capture usage from streamed response.completed events, persist it on assistant messages, and expose it via the v1 chat GET endpoint.
Frontend: introduce a TokenUsage type, capture usage from streaming events, and render token usage in assistant messages.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/api/v1/chat.py | Extends v1 conversation response mapping to include per-message token usage when available. |
| src/agent.py | Captures `usage` from `response.completed` during streaming; persists usage into message `response_data`; adds logging. |
| frontend/hooks/useChatStreaming.ts | Captures `response.completed` usage during streaming and attaches it to the final assistant message. |
| frontend/app/chat/page.tsx | Extracts usage from historical `response_data`, passes usage into `AssistantMessage`, and sets usage on non-streaming results. |
| frontend/app/chat/_types/types.ts | Adds `TokenUsage` and `Message.usage` typing. |
| frontend/app/chat/_components/token-usage.tsx | New UI component to display token usage. |
| frontend/app/chat/_components/assistant-message.tsx | Renders `TokenUsage` for completed (non-streaming) assistant messages. |

Copilot · 2026-02-22T16:43:42Z

+						if (msg.response_data && typeof msg.response_data === "object") {
+							const responseData =
+								typeof msg.response_data === "string"
+									? JSON.parse(msg.response_data)
+									: msg.response_data;
+							if (responseData.usage) {
+								message.usage = responseData.usage;
+							}
+						}


The current guard `if (msg.response_data && typeof msg.response_data === "object")` makes the subsequent `typeof msg.response_data === "string" ? JSON.parse(...)` branch unreachable, so usage will never be extracted when `response_data` is actually a string. Also, `JSON.parse` here can throw and break conversation loading if `response_data` is non-JSON. Consider widening the guard to accept `string | object` and wrapping parsing in a try/catch (or a small safe-parse helper) before reading `.usage`.

Copilot · 2026-02-22T16:43:42Z

-            })
+            }
+            # Include token usage if available (from Responses API)
+            usage = msg.get("response_data", {}).get("usage") if isinstance(msg.get("response_data"), dict) else None


`response_data` from Langflow/history may be serialized as a JSON string (the frontend already treats it as possibly-string). This code only extracts usage when `response_data` is a dict, so usage will be silently omitted for string payloads. Consider normalizing `response_data` once (e.g., parse JSON strings when possible) and then reading usage from the normalized object; also avoid calling `msg.get("response_data")` multiple times in the same expression for clarity.

Suggested change

-            usage = msg.get("response_data", {}).get("usage") if isinstance(msg.get("response_data"), dict) else None
+            response_data = msg.get("response_data")
+            if isinstance(response_data, str):
+                try:
+                    response_data = json.loads(response_data)
+                except Exception:
+                    # If parsing fails, leave response_data as-is (usage will be omitted)
+                    response_data = None
+            usage = response_data.get("usage") if isinstance(response_data, dict) else None

Copilot · 2026-02-22T16:43:43Z

+                # Detect response.completed event and log usage
+                if isinstance(chunk_data, dict) and chunk_data.get("type") == "response.completed":
+                    response_data = chunk_data.get("response", {})
+                    usage = response_data.get("usage")
+                    if usage:
+                        logger.info(
+                            "Stream usage data",
+                            input_tokens=usage.get("input_tokens"),
+                            output_tokens=usage.get("output_tokens"),
+                            total_tokens=usage.get("total_tokens"),
+                        )


Logging token usage at `info` level for every streamed `response.completed` event can generate a lot of log volume in production and may be inconsistent with nearby per-chunk logging (which is `debug`). Consider lowering this to `debug` (or gating behind a feature flag / sampling) to reduce operational noise while still allowing investigation when needed.
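A minimal sketch of what that suggestion could look like, assuming the usage extraction is pulled into a small helper. The helper name `extract_usage` and the logger name are hypothetical, and the standard-library `logging` module stands in for whatever structured logger the project actually uses:

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("agent.stream")  # hypothetical logger name


def extract_usage(chunk_data: Any) -> Optional[dict]:
    """Return the usage dict from a response.completed event, or None."""
    if not isinstance(chunk_data, dict):
        return None
    if chunk_data.get("type") != "response.completed":
        return None
    response = chunk_data.get("response")
    if not isinstance(response, dict):
        return None
    usage = response.get("usage")
    if not isinstance(usage, dict):
        return None
    # Logged at debug rather than info, matching the per-chunk logging level.
    logger.debug(
        "Stream usage data: input=%s output=%s total=%s",
        usage.get("input_tokens"),
        usage.get("output_tokens"),
        usage.get("total_tokens"),
    )
    return usage
```

Returning the dict from the helper keeps the logging decision in one place, so the level can later be changed or gated without touching the streaming loop.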

Copilot · 2026-02-22T16:43:43Z

+            # Capture usage from response.completed event
+            if chunk_data.get("type") == "response.completed":
+                response_obj = chunk_data.get("response", {})
+                usage_data = response_obj.get("usage")
        except:
            pass


The bare `except: pass` here will also swallow `asyncio.CancelledError` and any unexpected decoding/parsing errors, which can break cooperative cancellation and make stream issues extremely hard to diagnose. Prefer `except Exception as e` with at least a debug log, and let `CancelledError` propagate.
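The difference can be illustrated with a small standalone sketch. This is not the code from `src/agent.py`: `read_chunk`, `demo`, and the queue-based chunk source are hypothetical stand-ins. It relies on the fact that, since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so `except Exception` does not catch it:

```python
import asyncio
import json
import logging
from typing import Optional

logger = logging.getLogger("agent.stream")  # hypothetical logger name


async def read_chunk(queue: "asyncio.Queue[str]") -> Optional[dict]:
    """Wait for one raw chunk and decode it; return None if it is not a JSON object."""
    try:
        raw = await queue.get()  # cancellation is delivered at this await
        data = json.loads(raw)
        return data if isinstance(data, dict) else None
    except Exception as e:
        # json.JSONDecodeError lands here. asyncio.CancelledError does not,
        # because it is a BaseException subclass, so it keeps propagating.
        logger.debug("Failed to parse chunk: %s", e)
        return None


async def demo() -> tuple:
    """Feed a bad chunk and a good chunk, then cancel a pending read."""
    queue: "asyncio.Queue[str]" = asyncio.Queue()
    await queue.put("not json")
    bad = await read_chunk(queue)
    await queue.put('{"type": "response.completed"}')
    good = await read_chunk(queue)

    task = asyncio.create_task(read_chunk(queue))  # queue is empty, so this blocks
    await asyncio.sleep(0)  # give the task a chance to start waiting
    task.cancel()
    try:
        await task
        cancelled = False
    except asyncio.CancelledError:
        cancelled = True
    return bad, good, cancelled
```

Running `asyncio.run(demo())` shows the malformed chunk being skipped, the valid chunk being decoded, and the cancelled read actually ending as cancelled. With a bare `except:` in `read_chunk`, the cancelled task would instead return `None` as if nothing had happened.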

Copilot · 2026-02-22T16:43:43Z

        except:
            pass


Same issue as the other stream: `except: pass` will swallow `asyncio.CancelledError` and hide JSON decoding problems, which can lead to stuck/cancel-ignoring requests and makes debugging difficult. Prefer catching `Exception` (and logging) while allowing cancellation to propagate.

Suggested change

-        except:
-            pass
+        except Exception as e:
+            logger.warning(f"Failed to parse langflow chunk: {e}")

phact added 2 commits January 13, 2026 14:33

add usage data to chat backend

8c26e03

usage frontend

38072b2

edwinjosechittilappilly self-requested a review January 16, 2026 14:10

Merge branch 'main' into usage-data

08f675c

Merge branch 'main' into usage-data

5b44aac

aimurphy mentioned this pull request Feb 6, 2026

[Docs]: Token usage in chat #912

Closed

1 task

Merge branch 'main' into usage-data

2e4b575

edwinjosechittilappilly requested a review from Copilot February 22, 2026 16:39

Merge branch 'main' into usage-data

79b4a23

Copilot started reviewing on behalf of edwinjosechittilappilly February 22, 2026 16:40

edwinjosechittilappilly approved these changes Feb 22, 2026

Copilot AI reviewed Feb 22, 2026

Merge branch 'main' into usage-data

71eb413

edwinjosechittilappilly merged commit 4b50408 into main Feb 24, 2026
4 checks passed

edwinjosechittilappilly deleted the usage-data branch February 24, 2026 16:18

edwinjosechittilappilly added the enhancement (New feature or request) label Feb 26, 2026

edwinjosechittilappilly merged 7 commits into main from usage-data

3 participants
