feat: Enhanced performance monitoring and telemetry infrastructure by eLyiN · Pull Request #2157 · google-gemini/gemini-cli

eLyiN · 2025-06-27T08:24:15Z

Summary

This PR implements comprehensive performance monitoring capabilities for the Gemini CLI as proposed in issue #2127. The implementation extends the existing OpenTelemetry infrastructure with detailed startup instrumentation, memory usage tracking, and advanced metrics collection.

What's Changed

🚀 Detailed Startup Performance Monitoring

Phase-by-phase timing for all CLI startup phases
Rich attributes including auth type, telemetry settings, workspace info
Comprehensive instrumentation covering settings loading, cleanup, extensions, config loading, service initialization
Helps diagnose startup issues such as #1685 ("After starting, I saw some information and then exited on my own without seeing any prompt")

🧠 Comprehensive Memory Monitoring

Real-time memory tracking with new MemoryMonitor class
Automatic snapshots at key lifecycle points (startup, major operations)
Component-specific monitoring and memory growth analysis
Detailed heap statistics including RSS, external memory, heap usage
Helps detect memory issues such as #1687 ("rate Limit Detected")
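
A minimal sketch of what a labelled memory snapshot and a growth comparison could look like. The names `MemoryReading`, `MemorySnapshot`, `takeSnapshot`, and `heapGrowth` are illustrative assumptions, not the `MemoryMonitor` API from this PR; the reader is passed in so the sketch can be exercised without a live process.

```typescript
// Fields used by this sketch; Node's process.memoryUsage() returns a superset.
interface MemoryReading {
  rss: number;
  heapTotal: number;
  heapUsed: number;
  external: number;
}

interface MemorySnapshot extends MemoryReading {
  label: string;
  timestamp: number;
}

// Hypothetical helper: capture one snapshot at a named lifecycle point.
function takeSnapshot(
  label: string,
  read: () => MemoryReading,
): MemorySnapshot {
  const { rss, heapTotal, heapUsed, external } = read();
  return { label, timestamp: Date.now(), rss, heapTotal, heapUsed, external };
}

// Heap growth between two snapshots in bytes (negative if the heap shrank).
function heapGrowth(before: MemorySnapshot, after: MemorySnapshot): number {
  return after.heapUsed - before.heapUsed;
}
```

In Node.js, `takeSnapshot("startup", () => process.memoryUsage())` would supply real values.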

📊 Enhanced Metrics Framework

12 new performance metrics following OpenTelemetry patterns
Performance scoring and regression detection capabilities
Memory usage, CPU usage, tool execution breakdown metrics
Startup timing, token efficiency, API request breakdown tracking

Technical Implementation

✅ Extends existing OpenTelemetry infrastructure without breaking changes
✅ Follows established patterns (gemini_cli.* naming convention)
✅ Integrates with current configuration (local/GCP targets)
✅ Performance overhead designed to be <1% of total execution time
✅ Maintains API compatibility and existing telemetry controls

Files Changed

packages/core/src/telemetry/constants.ts: 12 new performance metrics constants
packages/core/src/telemetry/metrics.ts: Enhanced recording functions (344 lines)
packages/core/src/telemetry/memory-monitor.ts: New memory monitoring infrastructure (263 lines)
packages/core/src/telemetry/index.ts: Updated exports for new functionality
packages/cli/src/gemini.tsx: Startup performance instrumentation

Testing

✅ TypeScript compilation: All packages compile without errors
✅ Build verification: Both CLI and Core packages build successfully
✅ API compatibility: No breaking changes to existing telemetry API

Benefits

Better issue diagnosis: Resolve startup issues such as #1685 ("After starting, I saw some information and then exited on my own without seeing any prompt") with detailed timing data
Memory issue detection: Catch memory problems such as #1687 ("rate Limit Detected") before they impact users
Performance optimization: Provide data for optimizing CLI performance at scale
Backwards compatible: Seamless integration with existing telemetry infrastructure

Usage

The performance monitoring is automatically enabled when telemetry is enabled:

# Enable telemetry to get performance monitoring
gemini --telemetry

# Or via environment
export GEMINI_TELEMETRY_ENABLED=true

Performance metrics are collected alongside existing telemetry data and sent to the same OTLP endpoints.

Related Issues

Closes #2127 ("Enhanced Performance Monitoring: Startup instrumentation, memory tracking, and detailed metrics collection")
Helps diagnose #1685 ("After starting, I saw some information and then exited on my own without seeing any prompt") (startup/exit issues)
Helps detect #1687 ("rate Limit Detected") (memory issues)

Notes

This implementation provides the foundation for performance monitoring. Future enhancements could include real-time dashboards and automated alerting as discussed in #2127.

umairidris · 2025-06-29T18:14:27Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces performance monitoring and telemetry infrastructure. The review identified critical issues related to PII exposure and potential division-by-zero errors in metric calculations, as well as a high-severity issue related to double-counted metrics. Addressing these issues will improve the security and reliability of the new infrastructure.

eLyiN · 2025-06-29T19:00:31Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces performance monitoring and telemetry infrastructure. I've identified a couple of high-severity issues: potential for multiple memory monitors and division-by-zero in baseline comparisons. Addressing these will improve reliability.
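
One way to guard a baseline comparison against division by zero, as a sketch only. The helper name `percentChange` and the choice to return `undefined` are assumptions for illustration; the PR may handle this case differently.

```typescript
// Hypothetical helper: percent change of `current` relative to `baseline`.
// Returns undefined when the baseline is zero or not finite, so the caller
// can skip the metric instead of recording Infinity or NaN.
function percentChange(current: number, baseline: number): number | undefined {
  if (!Number.isFinite(baseline) || baseline === 0) {
    return undefined;
  }
  return ((current - baseline) / baseline) * 100;
}
```

Returning `undefined` rather than `0` keeps "no baseline yet" distinguishable from "no change".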

eLyiN · 2025-06-29T19:07:24Z

/gemini review

gemini-code-assist

Code Review

This PR introduces a comprehensive performance monitoring and telemetry infrastructure. The changes are extensive and well-thought-out. I've provided a few comments on potential improvements, focusing on error handling to prevent silent failures and ensure the new telemetry is as clear and useful as possible. Overall, this is a great addition to the project.

eLyiN · 2025-06-29T19:16:26Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive performance monitoring and telemetry infrastructure, which is a great enhancement for diagnosing performance issues and memory leaks. The implementation is well-structured and extends the existing OpenTelemetry setup effectively.

I've identified one high-severity issue in the new MemoryMonitor class. The periodic timer for memory snapshots is not being 'unreferenced', which will prevent the CLI process from exiting cleanly when it should. I've provided a suggestion to resolve this. Otherwise, the changes look solid.
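
The timer issue described above can be sketched as follows, assuming a Node.js runtime. `startPeriodicSnapshots` and its parameters are hypothetical names, not the PR's code; `unref()` and `hasRef()` are the standard methods on Node.js timer objects.

```typescript
// Hypothetical periodic snapshot loop; the callback and interval are placeholders.
function startPeriodicSnapshots(
  onTick: () => void,
  intervalMs: number,
): ReturnType<typeof setInterval> {
  const timer = setInterval(onTick, intervalMs);
  // An active interval keeps the Node.js event loop alive. unref() tells
  // the runtime not to wait for this timer when nothing else is pending,
  // so the process can exit on its own.
  timer.unref();
  return timer;
}
```

The timer still fires while other work keeps the process running; it just no longer blocks exit by itself.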

eLyiN · 2025-06-29T19:18:47Z

/gemini review

gemini-code-assist

Code Review

This pull request significantly enhances the CLI's performance monitoring and telemetry infrastructure by integrating detailed startup timing, comprehensive memory tracking, and new performance metrics. The changes are well-structured and align with the stated objectives of the PR.

eLyiN · 2025-06-29T19:37:32Z

/gemini review

gemini-code-assist

Code Review

This PR introduces performance monitoring and telemetry. There's a missing newline in packages/core/src/telemetry/index.ts. The MemoryMonitor singleton in packages/core/src/telemetry/memory-monitor.ts stores a config object, which can cause issues with metric attribution in multi-session scenarios. It needs refactoring to avoid storing the config object.
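
A sketch of the refactoring direction suggested above: the singleton holds no config, and each call receives the config of the session that produced the reading. `MonitorSketch`, `SessionConfig`, and `RecordedMetric` are hypothetical shapes for illustration, not the PR's types.

```typescript
// Hypothetical shapes for illustration only.
interface SessionConfig {
  sessionId: string;
}

interface RecordedMetric {
  name: string;
  value: number;
  sessionId: string;
}

class MonitorSketch {
  private static instance: MonitorSketch | undefined;
  readonly recorded: RecordedMetric[] = [];

  private constructor() {}

  static getInstance(): MonitorSketch {
    if (MonitorSketch.instance === undefined) {
      MonitorSketch.instance = new MonitorSketch();
    }
    return MonitorSketch.instance;
  }

  // Config is a per-call argument rather than stored state, so a metric is
  // always attributed to the session that passed it in.
  record(config: SessionConfig, name: string, value: number): void {
    this.recorded.push({ name, value, sessionId: config.sessionId });
  }
}
```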

jerop

@eLyiN thank you for making these improvements to OpenTelemetry

Please update the docs and add tests

eLyiN · 2025-06-29T21:13:56Z

Hi @jerop, thank you for reviewing my pull request. I've made the requested changes and also fixed some TypeScript issues that were causing problems with the npm run preflight command. Ready for review.

jerop · 2025-07-08T15:20:50Z

@eLyiN apologies for the delay here, could you please rebase this change? I will prioritize getting it in.

eLyiN · 2025-07-08T16:33:02Z

> @eLyiN apologies for the delay here, could you please rebase this change? I will prioritize getting it in.

@jerop No worries! Ready for merge!

jerop

Getting this error when I try this out:

INT value type cannot accept a floating-point value for gemini_cli.startup.duration, ignoring the  fractional digits.

It looks like the recordStartupPerformance function in packages/core/src/telemetry/metrics.ts is receiving a floating-point number for the duration, but the underlying metric startupTimeHistogram is configured to only accept integers.
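
One possible fix, sketched under assumptions: round the duration to whole milliseconds before recording it. High-resolution timers such as `performance.now()` return fractional milliseconds, which an integer-typed histogram truncates with the warning above. `HistogramLike` is a stand-in for the histogram's `record` method so the sketch runs without OpenTelemetry packages, and `recordStartupDurationMs` is a hypothetical name.

```typescript
type Attributes = Record<string, string | number | boolean>;

// Minimal stand-in for a histogram instrument; only record() is modelled.
interface HistogramLike {
  record(value: number, attributes?: Attributes): void;
}

// Hypothetical recorder: converts a fractional duration to whole
// milliseconds so an INT-typed histogram receives an integer.
function recordStartupDurationMs(
  histogram: HistogramLike,
  durationMs: number,
  attributes: Attributes = {},
): void {
  histogram.record(Math.round(durationMs), attributes);
}
```

An alternative would be to declare the histogram with a floating-point value type (`ValueType.DOUBLE` in `@opentelemetry/api`) if sub-millisecond precision matters.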

- Update telemetry.md with comprehensive activity monitoring docs
- Document high water mark tracking and rate limiting
- Add configuration examples and performance impact details
- Explain activity-driven monitoring architecture

- Fix recordActivity function to properly forward arguments to ActivityMonitor
- Update test implementations to match actual function signatures
- Correct import paths for different activity recording use cases
- Ensure activity-driven memory monitoring works as designed

This restores the intended behavior of the activity monitoring system.

- Introduce trackStartupPerformance wrapper function to reduce verbosity
- Replace repetitive start/end timing patterns with more terse API
- Add Chrome DevTools integration for development builds
- Update settings_sources from hardcoded 2 to correct value 3

Improves code maintainability and developer experience.
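
A sketch of how a timing wrapper can replace repeated start/end pairs. The signature below is an assumption for illustration (synchronous, with the recorder passed in); the PR's actual `trackStartupPerformance` may differ.

```typescript
// Hypothetical recorder callback: receives the phase name and its duration.
type StartupRecorder = (phase: string, durationMs: number) => void;

// Runs `operation`, measures how long it took, and reports the duration
// even if the operation throws.
function trackStartupPerformance<T>(
  phase: string,
  operation: () => T,
  record: StartupRecorder,
): T {
  const start = performance.now();
  try {
    return operation();
  } finally {
    record(phase, performance.now() - start);
  }
}
```

A call site then shrinks to a single expression, for example `const settings = trackStartupPerformance("load_settings", loadSettingsFn, recorder)`, where `loadSettingsFn` and `recorder` are placeholders.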

- Remove smoothing logic and weighted averaging from memory tracking
- Use current values directly for threshold comparison instead of smoothed readings
- Eliminate recentReadings buffer and related complexity
- Simplify constructor to only accept growth threshold parameter

Results in cleaner, more predictable memory monitoring behavior.
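
A sketch of a high-water mark tracker in the simplified form this commit describes: the current value is compared directly against the stored mark, with no smoothing buffer. `HighWaterMarkSketch`, `shouldRecord`, and the default threshold of 5 percent are assumptions for illustration, not the PR's implementation.

```typescript
class HighWaterMarkSketch {
  private readonly marks = new Map<string, number>();
  private readonly growthThresholdPercent: number;

  // The only constructor parameter is the growth threshold (assumed default: 5%).
  constructor(growthThresholdPercent: number = 5) {
    this.growthThresholdPercent = growthThresholdPercent;
  }

  // Returns true when `current` exceeds the stored mark by more than the
  // threshold, and raises the mark to `current` in that case.
  shouldRecord(metric: string, current: number): boolean {
    const mark = this.marks.get(metric);
    if (mark === undefined || mark <= 0) {
      // No usable mark yet: the reading becomes the baseline.
      this.marks.set(metric, current);
      return current > 0;
    }
    const growthPercent = ((current - mark) / mark) * 100;
    if (growthPercent > this.growthThresholdPercent) {
      this.marks.set(metric, current);
      return true;
    }
    return false;
  }
}
```

With a 10 percent threshold, readings of 100, 105, and 111 would record the first and third only.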

- Remove unnecessary comment from monitoring interval configuration
- Update memory monitor constructor to match simplified high-water mark tracker
- Fix minor formatting in telemetry documentation

Improves code clarity and maintains consistency across telemetry components.

- Update useActivityMonitoring tests to use proper mock setup with shared references
- Fix test expectations to match new ActivityMonitor.recordActivity implementation
- Remove obsolete getGrowthPercentage tests after smoothing removal
- Update high-water mark tracker constructor calls to match simplified API

Ensures all tests pass with the updated activity monitoring architecture.

- Fix TypeScript strict mode violations in test files
- Ensure all test imports and function signatures are correct
- Clean up unused variables and maintain code quality standards

All preflight checks now pass with 3,448 tests successfully running.

- Remove extra blank line in memory-monitor.test.ts after line 464
- Remove extra blank line in memory-monitor.ts after line 285
- Ensures code passes CI formatter check (git diff --exit-code)

- Fixed parseArguments call in gemini.tsx to use correct parameter
- Resolved TypeScript import type issues for consistent-type-imports rule
- Updated import/export patterns to separate type and value imports
- Maintained all performance monitoring and telemetry functionality
- All tests passing and linting clean

…rics
- Update authentication type handling to ensure 'unset' is recorded when no type is selected
- Modify theme name handling to default to 'unset' if no theme is specified
- Clean up commented code in fileUtils for better clarity

This improves the accuracy of performance monitoring data during startup.

- Added CURSOR_TRACE_ID environment variable stubs in multiple tests to improve IDE detection accuracy.
- Updated gemini.test.tsx to include mock implementations for getFileService and getCheckpointingEnabled.
- Refactored activity-monitor tests to import ActivityType from the correct module, ensuring consistency in activity tracking.

These changes improve the robustness of the testing framework and enhance the accuracy of IDE detection logic.

eLyiN · 2025-09-09T07:41:07Z

up!

jacob314 · 2025-09-09T16:26:00Z

Hope we can get this landed soon. It is needed for #7329.

jerop · 2025-09-09T19:29:29Z

@jacob314 @eLyiN apologies for the delay, removed the blocker from my "request changes"

eLyiN · 2025-09-09T19:32:48Z

Don’t worry, @jerop! I pushed some fixes to make sure npm ci passes again. It took quite a bit of continuous rebasing, but everything looks fine now.

jerop · 2025-09-09T19:41:21Z

@eLyiN any chance you can split this PR into smaller changes that we can merge one at a time? This is a 4800+ line change in one PR.

request to break down into smaller changes

eLyiN · 2025-09-09T21:22:28Z

@jerop @jacob314 I’ve split the original #2157 PR into 6 smaller, thematic PRs. Each PR builds and tests independently, with exports gated behind isPerformanceMonitoringActive so there are no runtime behaviour changes until later stages. This gives clear scope visibility, a clean dependency chain, and manageable review chunks.

For tracking, I’ve created issue #8120 to follow the overall progress.

We’re starting with PR #8110, which is ready for review. The remaining branches are prepared and will be opened as the sequence progresses.

jacob314 · 2025-09-09T22:42:32Z

Thank you for splitting this into multiple PRs! Closing this now to keep my list of prs I am reviewing accurate.

eLyiN requested a review from a team as a code owner · June 27, 2025 08:24

eLyiN force-pushed the feature/enhanced-performance-monitoring branch 2 times, most recently from d580d01 to d396283 · June 28, 2025 06:24