Skip to content

feat: Enhanced performance monitoring and telemetry infrastructure#2157

Closed
eLyiN wants to merge 31 commits intogoogle-gemini:mainfrom
eLyiN:feature/enhanced-performance-monitoring
Closed

feat: Enhanced performance monitoring and telemetry infrastructure#2157
eLyiN wants to merge 31 commits intogoogle-gemini:mainfrom
eLyiN:feature/enhanced-performance-monitoring

Conversation

@eLyiN
Copy link
Copy Markdown
Contributor

@eLyiN eLyiN commented Jun 27, 2025

Summary

This PR implements comprehensive performance monitoring capabilities for the Gemini CLI as proposed in issue #2127. The implementation extends the existing OpenTelemetry infrastructure with detailed startup instrumentation, memory usage tracking, and advanced metrics collection.

What's Changed

🚀 Detailed Startup Performance Monitoring

🧠 Comprehensive Memory Monitoring

  • Real-time memory tracking with new MemoryMonitor class
  • Automatic snapshots at key lifecycle points (startup, major operations)
  • Component-specific monitoring and memory growth analysis
  • Detailed heap statistics including RSS, external memory, heap usage
  • Helps detect memory issues like rate Limit Detected #1687

📊 Enhanced Metrics Framework

  • 12 new performance metrics following OpenTelemetry patterns
  • Performance scoring and regression detection capabilities
  • Memory usage, CPU usage, tool execution breakdown metrics
  • Startup timing, token efficiency, API request breakdown tracking

Technical Implementation

  • Extends existing OpenTelemetry infrastructure without breaking changes
  • Follows established patterns (gemini_cli.* naming convention)
  • Integrates with current configuration (local/GCP targets)
  • Performance overhead designed to be <1% of total execution
  • Maintains API compatibility and existing telemetry controls

Files Changed

  • packages/core/src/telemetry/constants.ts: 12 new performance metrics constants
  • packages/core/src/telemetry/metrics.ts: Enhanced recording functions (344 lines)
  • packages/core/src/telemetry/memory-monitor.ts: New memory monitoring infrastructure (263 lines)
  • packages/core/src/telemetry/index.ts: Updated exports for new functionality
  • packages/cli/src/gemini.tsx: Startup performance instrumentation

Testing

  • TypeScript compilation: All packages compile without errors
  • Build verification: Both CLI and Core packages build successfully
  • API compatibility: No breaking changes to existing telemetry API

Benefits

  1. Better issue diagnosis: Resolve startup issues like After starting, I saw some information and then exited on my own without seeing any prompt #1685 with detailed timing data
  2. Memory issue detection: Catch memory problems like rate Limit Detected #1687 before they impact users
  3. Performance optimization: Provide data for optimizing CLI performance at scale
  4. Backwards compatible: Seamless integration with existing telemetry infrastructure

Usage

The performance monitoring is automatically enabled when telemetry is enabled:

# Enable telemetry to get performance monitoring
gemini --telemetry

# Or via environment
export GEMINI_TELEMETRY_ENABLED=true

Performance metrics are collected alongside existing telemetry data and sent to the same OTLP endpoints.

Related Issues

Notes

This implementation provides the foundation for performance monitoring. Future enhancements could include real-time dashboards and automated alerting as discussed in #2127.

@eLyiN eLyiN requested a review from a team as a code owner June 27, 2025 08:24
@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch 2 times, most recently from d580d01 to d396283 Compare June 28, 2025 06:24
@umairidris
Copy link
Copy Markdown
Contributor

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces performance monitoring and telemetry infrastructure. The review identified critical issues related to PII exposure and potential division-by-zero errors in metric calculations, as well as a high-severity issue related to double-counted metrics. Addressing these issues will improve the security and reliability of the new infrastructure.

Comment thread packages/cli/src/gemini.tsx Outdated
Comment thread packages/cli/src/gemini.tsx Outdated
Comment thread packages/core/src/telemetry/memory-monitor.ts Outdated
Comment thread packages/core/src/telemetry/metrics.ts Outdated
Comment thread packages/core/src/telemetry/metrics.ts Outdated
@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch from 96bd439 to 5cf41c2 Compare June 29, 2025 19:00
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces performance monitoring and telemetry infrastructure. I've identified a couple of high-severity issues: potential for multiple memory monitors and division-by-zero in baseline comparisons. Addressing these will improve reliability.

Comment thread packages/core/src/telemetry/memory-monitor.ts Outdated
Comment thread packages/core/src/telemetry/metrics.ts Outdated
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR introduces a comprehensive performance monitoring and telemetry infrastructure. The changes are extensive and well-thought-out. I've provided a few comments on potential improvements, focusing on error handling to prevent silent failures and ensure the new telemetry is as clear and useful as possible. Overall, this is a great addition to the project.

Comment thread packages/core/src/telemetry/memory-monitor.ts Outdated
Comment thread packages/cli/src/gemini.tsx Outdated
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive performance monitoring and telemetry infrastructure, which is a great enhancement for diagnosing performance issues and memory leaks. The implementation is well-structured and extends the existing OpenTelemetry setup effectively.

I've identified one high-severity issue in the new MemoryMonitor class. The periodic timer for memory snapshots is not being 'unreferenced', which will prevent the CLI process from exiting cleanly when it should. I've provided a suggestion to resolve this. Otherwise, the changes look solid.

Comment thread packages/core/src/telemetry/memory-monitor.ts Outdated
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the CLI's performance monitoring and telemetry infrastructure by integrating detailed startup timing, comprehensive memory tracking, and new performance metrics. The changes are well-structured and align with the stated objectives of the PR.

Comment thread packages/core/src/telemetry/metrics.ts Outdated
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR introduces performance monitoring and telemetry. There's a missing newline in packages/core/src/telemetry/index.ts. The MemoryMonitor singleton in packages/core/src/telemetry/memory-monitor.ts stores a config object, which can cause issues with metric attribution in multi-session scenarios. It needs refactoring to avoid storing the config object.

Comment thread packages/core/src/telemetry/memory-monitor.ts Outdated
Comment thread packages/core/src/telemetry/index.ts Outdated
Copy link
Copy Markdown
Contributor

@jerop jerop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eLyiN thank you for making these improvements to OpenTelemetry

Please update the docs and add tests

@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch from c34945c to 7742b60 Compare June 29, 2025 21:07
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jun 29, 2025

Hi @jerop, thank you for reviewing my pull request. I've made the requested changes and also fixed some TypeScript issues that were causing problems with the npm run preflight command. Ready for review.

@eLyiN eLyiN requested a review from jerop June 29, 2025 21:14
@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch from 7742b60 to 6b59079 Compare June 30, 2025 14:26
@gemini-cli gemini-cli bot added kind/enhancement priority/p2 Important but can be addressed in a future release. labels Jul 3, 2025
@jerop
Copy link
Copy Markdown
Contributor

jerop commented Jul 8, 2025

@eLyiN apologies for the delay here, could you please rebase this change and will prioritize getting it in?

@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch from 6b59079 to bc7c079 Compare July 8, 2025 16:28
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Jul 8, 2025

@eLyiN apologies for the delay here, could you please rebase this change and will prioritize getting it in?

@jerop No worries! Ready for merge!

@jerop jerop force-pushed the feature/enhanced-performance-monitoring branch from 3065e0f to 67a596a Compare July 10, 2025 20:23
Copy link
Copy Markdown
Contributor

@jerop jerop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting this error when I try this out:

INT value type cannot accept a floating-point value for gemini_cli.startup.duration, ignoring the  fractional digits.

It looks like the recordStartupPerformance function in packages/core/src/telemetry/metrics.ts is receiving a floating-point number for the duration, but the underlying metric startupTimeHistogram is configured to only accept integers.

eLyiN added 12 commits September 9, 2025 09:40
- Update telemetry.md with comprehensive activity monitoring docs
- Document high water mark tracking and rate limiting
- Add configuration examples and performance impact details
- Explain activity-driven monitoring architecture
- Fix recordActivity function to properly forward arguments to ActivityMonitor
- Update test implementations to match actual function signatures
- Correct import paths for different activity recording use cases
- Ensure activity-driven memory monitoring works as designed

This restores the intended behavior of the activity monitoring system.
- Introduce trackStartupPerformance wrapper function to reduce verbosity
- Replace repetitive start/end timing patterns with more terse API
- Add Chrome DevTools integration for development builds
- Update settings_sources from hardcoded 2 to correct value 3

Improves code maintainability and developer experience.
- Remove smoothing logic and weighted averaging from memory tracking
- Use current values directly for threshold comparison instead of smoothed readings
- Eliminate recentReadings buffer and related complexity
- Simplify constructor to only accept growth threshold parameter

Results in cleaner, more predictable memory monitoring behavior.
- Remove unnecessary comment from monitoring interval configuration
- Update memory monitor constructor to match simplified high-water mark tracker
- Fix minor formatting in telemetry documentation

Improves code clarity and maintains consistency across telemetry components.
- Update useActivityMonitoring tests to use proper mock setup with shared references
- Fix test expectations to match new ActivityMonitor.recordActivity implementation
- Remove obsolete getGrowthPercentage tests after smoothing removal
- Update high-water mark tracker constructor calls to match simplified API

Ensures all tests pass with the updated activity monitoring architecture.
- Fix TypeScript strict mode violations in test files
- Ensure all test imports and function signatures are correct
- Clean up unused variables and maintain code quality standards

All preflight checks now pass with 3,448 tests successfully running.
- Remove extra blank line in memory-monitor.test.ts after line 464
- Remove extra blank line in memory-monitor.ts after line 285
- Ensures code passes CI formatter check (git diff --exit-code)
- Fixed parseArguments call in gemini.tsx to use correct parameter
- Resolved TypeScript import type issues for consistent-type-imports rule
- Updated import/export patterns to separate type and value imports
- Maintained all performance monitoring and telemetry functionality
- All tests passing and linting clean
…rics

- Update authentication type handling to ensure 'unset' is recorded when no type is selected
- Modify theme name handling to default to 'unset' if no theme is specified
- Clean up commented code in fileUtils for better clarity

This improves the accuracy of performance monitoring data during startup.
- Added CURSOR_TRACE_ID environment variable stubs in multiple tests to improve IDE detection accuracy.
- Updated gemini.test.tsx to include mock implementations for getFileService and getCheckpointingEnabled.
- Refactored activity-monitor tests to import ActivityType from the correct module, ensuring consistency in activity tracking.

These changes improve the robustness of the testing framework and enhance the accuracy of IDE detection logic.
auto-merge was automatically disabled September 9, 2025 07:40

Head branch was pushed to by a user without write access

@eLyiN eLyiN force-pushed the feature/enhanced-performance-monitoring branch from b560033 to 06875c6 Compare September 9, 2025 07:40
@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Sep 9, 2025

up!

@jacob314
Copy link
Copy Markdown
Contributor

jacob314 commented Sep 9, 2025

Hope we can this landed soon. It is needed for #7329.

jerop
jerop previously approved these changes Sep 9, 2025
@jerop
Copy link
Copy Markdown
Contributor

jerop commented Sep 9, 2025

@jacob314 @eLyiN apologies for the delay, removed the blocker from my "request changes"

@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Sep 9, 2025

Don’t worry, @jerop! I pushed some fixes to make sure npm ci passes again. It took quite a bit of continuous rebasing, but everything looks fine now.

@jerop
Copy link
Copy Markdown
Contributor

jerop commented Sep 9, 2025

@eLyiN any chance you can split this PR into smaller changes that we can merge at a time? this is 4800+ line change in one PR

@eLyiN
Copy link
Copy Markdown
Contributor Author

eLyiN commented Sep 9, 2025

@jerop @jacob314 I’ve split the original #2157 PR into 6 smaller, thematic PRs. Each PR builds and tests independently, with exports gated behind isPerformanceMonitoringActive so there are no runtime behaviour changes until later stages. This gives clear scope visibility, a clean dependency chain, and manageable review chunks.

For tracking, I’ve created issue #8120 to follow the overall progress.

We’re starting with PR #8110, which is ready for review. The remaining branches are prepared and will be opened as the sequence progresses.

@jacob314
Copy link
Copy Markdown
Contributor

jacob314 commented Sep 9, 2025

Thank you for splitting this into multiple PRs! Closing this now to keep my list of prs I am reviewing accurate.

@jacob314 jacob314 closed this Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority/p2 Important but can be addressed in a future release.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhanced Performance Monitoring: Startup instrumentation, memory tracking, and detailed metrics collection

8 participants