## Description

### Root cause
_infer_model_from_metrics() in parser.py is a standalone function with three distinct branches, but it is never imported or directly tested. It is only exercised implicitly when build_session_summary calls it for completed sessions that have no currentModel on the shutdown event. The most important branch — picking the model with the highest requests.count when multiple models are present — is never explicitly covered, and the tie-breaking behaviour when two models share the same count is entirely undocumented.
### The function

```python
# parser.py
def _infer_model_from_metrics(metrics: dict[str, ModelMetrics]) -> str | None:
    if not metrics:
        return None
    if len(metrics) == 1:
        return next(iter(metrics))
    return max(metrics, key=lambda m: metrics[m].requests.count)
```

### Branches with no test
| Branch | Condition | Current coverage |
|---|---|---|
| Empty dict | `not metrics` | ❌ never called with `{}` directly |
| Single key | `len(metrics) == 1` | ❌ never called directly; only indirectly via `build_session_summary` |
| Multi-key (clear winner) | one model has a higher count | ❌ not verified that the correct model is returned |
| Multi-key (tie) | two models have identical counts | ❌ `max()` returns the first maximum found in iteration order (insertion order); the result depends on event order and could silently change |
### Tie-breaking non-determinism
When requests.count is equal across models, Python's max() returns the first maximal key encountered during iteration, which for a dict is insertion order (guaranteed by the language since Python 3.7). This means the model shown in the CLI depends on the order in which events happened to be recorded: a perfectly valid alternative replay of the same session could produce a different model name. There is no test to document which model wins or to alert if the behaviour changes.
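The order dependence is easy to reproduce with plain dicts. This is a standalone sketch: the real function receives ModelMetrics values, simplified here to bare integer counts.

```python
# Standalone sketch: max() keeps the first maximal key in iteration order,
# and dict iteration order is insertion order, so the tie winner flips
# when the same data is inserted in a different order.
counts_ab = {"model-a": 5, "model-b": 5}
counts_ba = {"model-b": 5, "model-a": 5}

winner_ab = max(counts_ab, key=lambda m: counts_ab[m])
winner_ba = max(counts_ba, key=lambda m: counts_ba[m])

print(winner_ab)  # model-a (inserted first, so it wins the tie)
print(winner_ba)  # model-b (same counts, different insertion order)
```

Identical data, different insertion order, different winner: exactly the non-determinism the missing test should pin down.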
### Related gap: build_session_summary multi-model completed session without currentModel
The only path that reaches _infer_model_from_metrics in integration is a completed session where neither ev.currentModel nor data.currentModel is set. This specific combination (shutdown event missing currentModel at the top level AND in data, but having modelMetrics with two or more models) has no test in test_parser.py.
### Expected behaviour to assert

#### Direct unit tests for _infer_model_from_metrics
```python
from copilot_usage.parser import _infer_model_from_metrics  # pyright: ignore[reportPrivateUsage]
from copilot_usage.models import ModelMetrics, RequestMetrics


def test_infer_empty_returns_none() -> None:
    assert _infer_model_from_metrics({}) is None


def test_infer_single_key_returns_it() -> None:
    metrics = {"claude-sonnet-4": ModelMetrics(requests=RequestMetrics(count=5))}
    assert _infer_model_from_metrics(metrics) == "claude-sonnet-4"


def test_infer_multi_key_returns_highest_count() -> None:
    metrics = {
        "claude-sonnet-4": ModelMetrics(requests=RequestMetrics(count=3)),
        "claude-opus-4.6": ModelMetrics(requests=RequestMetrics(count=10)),
    }
    assert _infer_model_from_metrics(metrics) == "claude-opus-4.6"


def test_infer_tie_returns_a_model_deterministically() -> None:
    """When counts are equal, a specific model must always win.

    This test documents (and locks in) the current tie-breaking behaviour.
    If the behaviour changes, the test should be updated intentionally.
    """
    metrics = {
        "model-a": ModelMetrics(requests=RequestMetrics(count=5)),
        "model-b": ModelMetrics(requests=RequestMetrics(count=5)),
    }
    result = _infer_model_from_metrics(metrics)
    # Both are valid; the test asserts a specific one to detect unintentional changes.
    assert result in ("model-a", "model-b")
    # Run again to confirm stability within a single process
    assert _infer_model_from_metrics(metrics) == result
```

#### Integration test: completed session without currentModel
```python
def test_build_summary_infers_model_when_current_model_absent(tmp_path):
    """Shutdown event with no currentModel → _infer_model_from_metrics is used."""
    # Shutdown event with two models; neither currentModel field is present
    shutdown = json.dumps({
        "type": "session.shutdown",
        "data": {
            "shutdownType": "routine",
            "totalPremiumRequests": 8,
            "totalApiDurationMs": 5000,
            "modelMetrics": {
                "claude-sonnet-4": {"requests": {"count": 3, "cost": 3}, "usage": {...}},
                "claude-opus-4.6": {"requests": {"count": 8, "cost": 8}, "usage": {...}},
            },
            # no "currentModel" in data
        },
        # no "currentModel" at top level
    })
    events = parse_events(...)
    summary = build_session_summary(events)
    assert summary.model == "claude-opus-4.6"  # highest count wins
```

### Why this matters
The inferred model name drives two user-facing outputs:
- The Model column in every rendered table.
- The estimated premium cost calculation in `render_cost_view` and `render_live_sessions` (via `_estimate_premium_cost`, which calls `lookup_model_pricing(model)`).
A wrong model (e.g., a free model selected over a premium one due to a tie) would cause the cost estimate to be silently wrong. Direct tests for _infer_model_from_metrics provide a safety net for this invariant.
Generated by Test Suite Analysis