
[GitHub-462] Reasoning Token Count Support#463

Open
jacekzimonski wants to merge 4 commits into main from
jacek/GitHub-462-Reasoning-Token-Count-Support

Conversation

@jacekzimonski
Collaborator

No description provided.

@vercel

vercel bot commented Oct 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| anotherai | Ready | Preview | Comment | Oct 13, 2025 2:34pm |
| anotherai-docs | Ready | Preview | Comment | Oct 13, 2025 2:34pm |

@claude
Contributor

claude bot commented Oct 13, 2025

Claude finished @jacekzimonski's task — View job


PR Review: Reasoning Token Count Support

Reviewing the implementation of reasoning token count support across backend and frontend.

Review Tasks:

  • Analyze backend domain and storage changes
  • Review API models and service implementations
  • Examine frontend UI components and utilities
  • Check test coverage and quality
  • Assess security and performance implications
  • Provide comprehensive feedback

Summary

This PR successfully implements reasoning token count support across the AnotherAI platform, adding comprehensive tracking and display of reasoning tokens used by LLM models. The implementation is well-architected and follows existing patterns.

Code Quality & Architecture ✅

Backend Implementation

Excellent - The backend changes are well-structured:

  • backend/core/domain/experiment.py:27 - Clean addition of reasoning_token_count field to ExperimentOutput
  • backend/protocol/api/_api_models.py:366 - Proper addition to CompletionUsage model
  • backend/protocol/api/_services/experiment_service.py:88-104 - Smart implementation that fetches traces from ClickHouse and extracts reasoning tokens
  • backend/protocol/api/_services/conversions.py:850,922 - Consistent conversion logic between domain and API models

Frontend Implementation

Excellent - Frontend changes demonstrate solid engineering:

  • web/src/types/models.ts:159,215 - Clean type definitions for reasoning tokens
  • web/src/components/utils/utils.ts:1177-1304 - Robust utility functions with comprehensive reasoning token handling
  • Consistent integration across UI components (MetricItem, MatrixSection, TracesView, CompletionCell)

Test Coverage ✅

Good - The test coverage for new functionality is comprehensive:

  • web/src/components/utils/__tests__/utils.test.ts:344-430 - Excellent test coverage for reasoning token utilities
  • Tests cover edge cases like undefined values, error states, and trace parsing
  • Tests validate both ExperimentCompletion direct field access and trace fallback logic

Performance Considerations ⚠️

Minor Concern - One potential optimization:

  • backend/protocol/api/_services/experiment_service.py:91-92 - The service excludes large fields when fetching traces, which is good for performance
  • However, the trace parsing (_calculate_reasoning_tokens_from_traces) runs on every experiment fetch
  • Consider caching or pre-computing reasoning token totals if this becomes a bottleneck
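One way the caching suggestion could look, as a minimal sketch: memoize the computed total per completion id so repeated experiment fetches skip re-parsing traces. The names here (`reasoning_tokens_cached`, the `kind` and `usage.reasoning_token_count` keys) are illustrative assumptions, not the PR's actual API.

```python
# Hypothetical per-completion cache for reasoning-token totals.
# Field names ("kind", "usage", "reasoning_token_count") are assumed, not
# taken from the PR; adapt them to the real trace schema.
_reasoning_cache: dict[str, int] = {}

def reasoning_tokens_cached(completion_id: str, traces: list[dict]) -> int:
    if completion_id not in _reasoning_cache:
        total = 0
        for trace in traces:
            if trace.get("kind") == "llm":  # only LLM traces carry reasoning tokens
                usage = trace.get("usage") or {}
                total += usage.get("reasoning_token_count") or 0
        _reasoning_cache[completion_id] = total
    return _reasoning_cache[completion_id]
```

A module-level dict is the simplest option; in the actual service a TTL cache or pre-computation at completion-storage time would avoid unbounded growth.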

Security Assessment ✅

Secure - No security concerns identified:

  • All new fields follow existing patterns for data validation and sanitization
  • No direct user input handling in the reasoning token logic
  • Appropriate error handling for malformed trace data

Implementation Highlights ✅

Smart Trace Parsing

The reasoning token extraction logic is robust:

```python
# backend/protocol/api/_services/experiment_service.py:119-152
def _calculate_reasoning_tokens_from_traces(self, traces):
    # Handles both new detailed usage structure and graceful fallbacks
    # Only counts LLM traces, properly validates field existence
```

Comprehensive Frontend Integration

The frontend seamlessly integrates reasoning tokens:

```typescript
// web/src/components/utils/utils.ts:1184-1224
export function getReasoningTokenCount(completion) {
    // Smart handling of both ExperimentCompletion field and trace parsing
    // Proper undefined handling for backward compatibility
}
```

UI Components Enhancement

The MetricItem component properly categorizes reasoning as a metric type:

```typescript
// web/src/components/MetricItem.tsx:74-76
if (metricKey.includes("reasoning")) {
    return "reasoning";
}
```

Minor Issues & Suggestions

  1. Type Safety - Consider adding runtime validation for trace structure in _calculate_reasoning_tokens_from_traces

  2. ESLint Config - The ESLint changes appear unrelated to reasoning tokens:

    // web/eslint.config.mjs:26-36 - Added import restrictions

    These seem like a separate improvement but are fine to include.

  3. Error Handling - The trace parsing logic could benefit from more specific error logging when trace structure is unexpected.
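Points 1 and 3 could be addressed together with a small shape check that logs when a trace does not match the expected structure. This is a hedged sketch only — the expected keys (`kind`, `usage`) are assumptions about the trace schema, and the real service may prefer pydantic models instead:

```python
# Hypothetical runtime validation for incoming trace dicts, combining the
# type-safety and error-logging suggestions. Keys are assumed, not confirmed.
import logging

logger = logging.getLogger(__name__)

def validate_trace(trace: object) -> bool:
    if not isinstance(trace, dict):
        logger.warning("Unexpected trace type: %s", type(trace).__name__)
        return False
    usage = trace.get("usage")
    if usage is not None and not isinstance(usage, dict):
        logger.warning("Unexpected usage payload in trace: %r", usage)
        return False
    return True
```

Calling this at the top of the trace-parsing loop would make malformed ClickHouse rows visible in logs instead of being silently skipped.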

Recommendations ✅

  1. Monitor Performance - Keep an eye on experiment loading times with large numbers of traces
  2. Consider Caching - If reasoning token calculation becomes expensive, consider pre-computing during completion storage
  3. Add Metric - Consider adding monitoring for reasoning token usage patterns

Overall Assessment

Approved ✅ This is a high-quality implementation that:

  • Follows existing architectural patterns
  • Includes comprehensive test coverage
  • Handles edge cases gracefully
  • Integrates seamlessly across the UI
  • Maintains backward compatibility
  • Has no security concerns

The reasoning token support is well-implemented and ready for production. The code quality is excellent and the feature adds valuable insights for AI model usage analysis.

---
Branch: jacek/GitHub-462-Reasoning-Token-Count-Support
