Limit recursion when looking for .gitignore files #8103
bbiggs merged 3 commits into google-gemini:main
Summary of Changes
Hello @bbiggs, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a performance bottleneck during application startup in large codebases. By replacing the recursive search for .gitignore files with a controlled, iterative one and limiting the number of directories scanned, it reduces the time spent on file system traversal, so the Gemini CLI initializes faster.
Highlights
- Performance Improvement: Implemented a limit on the number of directories scanned when searching for .gitignore files to prevent slow startup times in large repositories.
- Refactored Gitignore Parsing: Changed the .gitignore file discovery mechanism from a recursive approach to an iterative, queue-based (BFS) method, allowing for better control over the search depth and scanned directories.
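The queue-based search described in the highlights can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: the function name `findGitignoreFiles` and its parameters are hypothetical, and skipping the `.git` directory is an assumption made for the example.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper (not the PR's actual code): collects .gitignore paths
// breadth-first and stops after visiting maxScannedDirs directories.
function findGitignoreFiles(root: string, maxScannedDirs: number): string[] {
  const found: string[] = [];
  const queue: string[] = [root];
  let scannedDirs = 0;

  while (queue.length > 0 && scannedDirs < maxScannedDirs) {
    const dir = queue.shift();
    if (dir === undefined) break;
    scannedDirs++;

    let entries: fs.Dirent[];
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch {
      continue; // unreadable directory: skip it and keep going
    }

    for (const entry of entries) {
      if (entry.isDirectory()) {
        // Assumption for this sketch: never descend into .git itself.
        if (entry.name !== '.git') {
          queue.push(path.join(dir, entry.name));
        }
      } else if (entry.name === '.gitignore') {
        found.push(path.join(dir, entry.name));
      }
    }
  }
  return found;
}
```

Because the traversal is breadth-first, the .gitignore files closest to the root are found first, so hitting the limit drops only the deepest ones.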
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request refactors the recursive search for .gitignore files into an iterative, breadth-first search to improve performance and prevent stack overflows in large repositories. A limit on the number of scanned directories is also introduced. While this is a solid improvement, I've identified a potential high-severity issue where the new implementation could lead to excessive memory usage in repositories with very wide directory structures. My review includes a specific code suggestion to address this potential memory issue by capping the traversal queue size.
    if (entry.isDirectory()) {
      queue.push(path.join(dir, entry.name));
    }
The current implementation of the breadth-first search can lead to excessive memory consumption in repositories where a directory contains a very large number of subdirectories. Because all subdirectories of a given directory are added to the queue in one go, the queue array can grow very large before the scannedDirs < this.maxScannedDirs condition in the while loop stops the traversal. This could introduce a memory-related performance issue, which is contrary to the goal of this pull request.
To prevent this, I suggest capping the size of the queue to avoid it growing uncontrollably. By stopping the addition of new directories to the queue when its size approaches maxScannedDirs, we can effectively limit memory usage.
    if (entry.isDirectory()) {
      if (queue.length >= this.maxScannedDirs) {
        break;
      }
      queue.push(path.join(dir, entry.name));
    }
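To see the effect of the cap in isolation, here is a small sketch that runs the same traversal over an in-memory tree instead of the file system. The `Tree` type, the function `bfsWithQueueCap`, and the `peakQueue` counter are hypothetical names introduced only for this illustration; they are not part of the pull request.

```typescript
// Hypothetical in-memory stand-in for a directory tree: name -> child names.
type Tree = Map<string, string[]>;

interface BfsResult {
  visited: string[]; // directories actually scanned, in BFS order
  peakQueue: number; // largest queue length observed during the traversal
}

function bfsWithQueueCap(tree: Tree, root: string, maxScannedDirs: number): BfsResult {
  const queue: string[] = [root];
  const visited: string[] = [];
  let peakQueue = queue.length;

  while (queue.length > 0 && visited.length < maxScannedDirs) {
    const dir = queue.shift();
    if (dir === undefined) break;
    visited.push(dir);

    for (const child of tree.get(dir) ?? []) {
      // The cap from the suggestion: stop enqueuing once the queue already
      // holds as many directories as could ever be scanned.
      if (queue.length >= maxScannedDirs) {
        break;
      }
      queue.push(child);
    }
    peakQueue = Math.max(peakQueue, queue.length);
  }
  return { visited, peakQueue };
}

// A root with 10,000 subdirectories and a scan limit of 5.
const wide: Tree = new Map([
  ['root', Array.from({ length: 10_000 }, (_, i) => `dir${i}`)],
]);
const result = bfsWithQueueCap(wide, 'root', 5);
console.log(result.visited.length, result.peakQueue);
```

With the cap in place the queue never holds more than maxScannedDirs entries, however wide a directory is. Note that `break` also ends the loop over the current directory's entries, so if that loop inspects files as well (an assumption about the surrounding code), skipping only the push may be the safer variant.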
cornmander left a comment
Created an issue for adding tests: #8104
Co-authored-by: cornmander <shikhman@google.com>
TLDR
Limits recursion when searching for .gitignore files.
Dive Deeper
On startup in large directory trees, the recursive search for .gitignore files kept the Gemini CLI from starting quickly.
Linked issues / bugs
Fixes #8099