common: Add REVIEW.md with code review guidelines#63

Merged
cgwalters merged 1 commit into bootc-dev:main from cgwalters:agents-to-review on Dec 19, 2025
Conversation

@cgwalters (Contributor)

Add a REVIEW.md file capturing code review expectations derived from analysis of PR feedback across the bootc-dev organization. This includes guidance on testing, code quality, commit organization, architecture decisions, and Rust-specific patterns.

Update AGENTS.md to reference REVIEW.md, encouraging AI agents to spawn review subagents using these guidelines after completing work.

Assisted-by: OpenCode (Opus 4.5)

Assisted-by: OpenCode (Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>
@gemini-code-assist (Bot)
Summary of Changes

Hello @cgwalters, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces a foundational document, REVIEW.md, to standardize and formalize the code review process within the bootc-dev organization. By consolidating best practices and expectations, it aims to improve code quality, consistency, and the efficiency of reviews. Additionally, it integrates these new guidelines into the AGENTS.md document, promoting their adoption by AI-assisted development workflows.

Highlights

  • New Code Review Guidelines: A new REVIEW.md file has been added, establishing comprehensive code review expectations and standards derived from past PR feedback across the bootc-dev organization.
  • Updated AI Agent Guidelines: The AGENTS.md file has been updated to reference the new REVIEW.md, encouraging AI agents to utilize these guidelines for self-review and when reviewing other code.
  • Comprehensive Coverage: The guidelines cover a wide range of topics including testing methodologies, code quality standards, commit and pull request organization, architectural decisions, and Rust-specific best practices.


@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces a comprehensive REVIEW.md file with code review guidelines and updates AGENTS.md to reference it. The guidelines are well-structured, clear, and cover important aspects of software development from testing to architecture. This is a valuable addition for promoting consistency and quality. I've included a couple of minor grammatical suggestions for the documentation.

Comment thread common/AGENTS.md
Comment thread common/REVIEW.md
@cgwalters (Contributor, Author)

One thing I am strongly considering is dropping Gemini Code Assist from our repositories and moving to a model where a human initiates review using our common guidelines, with their own tools (and AI budget).

If we do go that way, though, I think it's important to ensure humans can easily ingest an AI-generated review targeting the GH Checks API and tweak it where necessary. This needs some careful consideration: we should make it clear when the human is effectively just passing through AI content versus having performed some edits. Maybe the review "header" has `Assisted-by: ToolName (ModelName)`, and each comment starts out with an `AI:` prefix by default?

Then if they do any editing or research, they can drop that prefix.

(Again, using the GH Checks API is, I think, quite important for large PRs, to ensure the comments land inline next to the relevant code.)

@cgwalters cgwalters enabled auto-merge (squash) December 19, 2025 15:04
@cgwalters (Contributor, Author)

I gave this prompt to OpenCode+Opus 4.5:

look at the tip commit. here's our goal: I want you to spawn a subagent to check out a representative sample of 20 recent pull requests, and perform a review locally using these guidelines and then compare it with what the review actually said! I am now especially interested in cases where Gemini Code Review spotted things you didn't - let's see if we can figure out why/how it did that and how we can refine our guidelines! To check out code, do so in a new worktree

After some more iteration, it came up with the summary below, and my conclusion is that we could likely get by with AI-assisted human review by default and don't need the Gemini Code Review app.

AI-generated content follows


Gemini Code Review vs OpenCode (Opus): Comparative Analysis

This document summarizes findings from analyzing 20+ recent PRs in the bootc-dev/bootc repository to compare Gemini Code Review's catches against what our REVIEW.md guidelines can detect.

Executive Summary

Our REVIEW.md guidelines are sufficient to catch all high-value issues Gemini found. The gap is not in the guidelines themselves but in consistent application: when subagents were explicitly asked to apply the guidelines to specific code, they independently found all 5 of the Gemini catches that had received +1 reactions from maintainers.

Methodology

  1. Collected all Gemini Code Review comments from ~100 recent PRs
  2. Filtered for comments that received +1 reactions from human reviewers (signal for high-quality catches)
  3. Found 17 such comments across the repository
  4. Selected 5 representative high-value catches for replication testing
  5. Spawned subagents with ONLY our REVIEW.md guidelines (no knowledge of what Gemini found)
  6. Compared results

High-Value Gemini Catches That Received +1 Reactions

Categories of Validated Catches

| Category | Count | Examples |
|---|---|---|
| Logic/Correctness Bugs | 4 | Off-by-one error, missing `=` in format string, shell conditional bug |
| Ignored/Swallowed Errors | 2 | `gpt_workaround()` result ignored, error details discarded |
| Dead/Redundant Code | 3 | Redundant Containerfile, redundant function call, unused import |
| Code Duplication/Constants | 2 | Duplicated conditional in spec file, hardcoded string vs constant |
| Idiomatic Improvements | 3 | Simplified filter, cleaner function signatures |
| Documentation Mismatch | 1 | Implementation didn't match documentation |
| Test Bugs | 1 | Test checking wrong variable |
| Undocumented Behavior Changes | 1 | Mount type default change not in PR description |

The 5 Cases Selected for Replication

| PR | Issue | Severity | Who +1'd |
|---|---|---|---|
| #1502 | Missing `=` in format string: `{COMPOSEFS_CMDLINE}{id_hex}` | Critical | cgwalters |
| #1715 | Off-by-one: `idx > len` should be `idx >= len` | High | Johan-Liebert1 |
| #1550 | `gpt_workaround()` result silently ignored | High | Johan-Liebert1 |
| #1500 | Shell `test -n "${initramfs:-}"` true for `"0"` | High | cgwalters |
| #1556 | Mount type default changed without PR description mention | Medium | Johan-Liebert1 |

Replication Results

Can Our Guidelines Catch These Issues?

| PR | Gemini's Catch | Subagent Found It? | Relevant Guideline |
|---|---|---|---|
| #1502 | Missing `=` in format string | YES | Careful review of refactoring changes |
| #1715 | Off-by-one array bounds | YES | General code quality / edge cases |
| #1550 | Ignored `Result` | YES | "Don't swallow errors silently" |
| #1500 | Shell conditional logic bug | YES | Shell script scrutiny |
| #1556 | Undocumented behavioral change | YES | "Document limitations/caveats in PR" |

Result: 5/5 (100%) - All issues were independently found by subagents using only REVIEW.md guidelines.

Subagent Findings Detail

PR #1502: Missing = in Format String

Gemini's catch: The refactoring dropped the `=` sign, changing `composefs={id_hex}` to `composefs{id_hex}`.

Subagent's finding:

"The refactoring introduced a bug by dropping the = sign. The original code produced composefs=<hash>, but the refactored code now produces composefs<hash> (missing the =). This would result in a malformed kernel command line parameter."

PR #1715: Off-by-One Error

Gemini's catch: `idx > manifest.layers().len()` should be `>=` to prevent panic.

Subagent's finding:

"Array indices are 0-based, so if manifest.layers().len() is 5, valid indices are 0-4. The check idx > manifest.layers().len() would only trigger when idx > 5, but accessing manifest.layers()[5] would already be out of bounds."

PR #1550: Ignored Result

Gemini's catch: `gpt_workaround();` ignores the `Result<()>` return value.

Subagent's finding:

"The gpt_workaround() function returns Result<()>, but the return value is completely ignored. If this function fails, the error is silently discarded. This violates 'Don't swallow errors silently - propagate them with context'."

PR #1500: Shell Conditional Bug

Gemini's catch: `test -n "${initramfs:-}"` is true for any non-empty value, including `"0"`.

Subagent's finding:

"The ARG initramfs=0 sets the variable to the string "0", but test -n tests if the variable is non-empty. Since "0" is a non-empty string, this condition will be true even when initramfs=0 (disabled)."

PR #1556: Undocumented Behavioral Change

Gemini's catch: Default mount type for `/etc` changed from `Overlay` to `Bind` without mention in the PR description.

Subagent's finding:

"The default mount type for /etc changed from MountType::Overlay to MountType::Bind, but this is not mentioned in the PR description which 'focuses on error handling improvements.' This is a significant behavioral change that affects how /etc is mounted by default."

Gemini's Strengths

Based on the +1'd comments, Gemini excels at:

  1. Format string validation - Catching missing characters in format strings, especially in kernel command lines
  2. Boundary conditions - Off-by-one errors in array/slice access
  3. Dead/redundant code detection - Files, imports, and code blocks that became stale
  4. Shell script semantics - Understanding bash conditional edge cases
  5. Cross-referencing - Comparing new code vs old code, docs vs implementation
  6. PR description accuracy - Noticing when behavioral changes aren't documented

Gemini's Weaknesses

From analyzing comments without +1 reactions:

  1. Rust-specific false positives - Occasionally flagged lifetime/borrow issues that the compiler would catch
  2. Less thorough on CI/YAML - PR #1860 (CI workflow) received no feedback
  3. Debatable micro-optimizations - Some suggestions, like removing `BufWriter`, were questionable

Key Insights

Why Our Guidelines Work

The REVIEW.md guidelines successfully cover all the high-value catches because:

| Guideline | What It Catches |
|---|---|
| "Don't swallow errors silently" | Ignored `Result`s, missing `?` operators |
| "Extract magic numbers/strings into constants" | Forces attention on format strings during refactoring |
| "Handle edge cases explicitly" | Boundary conditions, off-by-one errors |
| "Document limitations and caveats" | Undocumented behavioral changes |
| "Shell scripts should be under 50 lines" + scrutiny | Shell conditional bugs |

The Real Gap

The issue is not what our guidelines say, but consistent application:

  1. Gemini checks every line, every time - Humans and AI reviewers can miss things when not focused
  2. Gemini is particularly attentive to string manipulation - Format strings, paths, configuration values
  3. Gemini systematically compares old vs new - Easier to spot when refactoring drops something

Recommendations

No Major Guideline Changes Needed

The current REVIEW.md is sufficient. However, consider adding explicit callouts for:

  1. Format string verification - "When refactoring code that constructs strings (especially command lines, paths, URLs), verify all delimiters and separators are preserved"

  2. Boundary condition checklist - "For array/slice access, verify: > vs >=, < vs <=, off-by-one in loops"

  3. Shell boolean patterns - "For boolean-like shell variables, use explicit value comparison (= "1") rather than empty/non-empty tests (-n)"

Process Recommendations

  1. Use Gemini as a complement, not replacement - It catches things humans miss through fatigue
  2. Trust +1'd Gemini comments - When maintainers validate a catch, it's worth learning from
  3. Gemini is less reliable for Rust-specific issues - The compiler handles most of those anyway

Conclusion

Our REVIEW.md guidelines are comprehensive enough to catch all the high-value issues Gemini found. The value Gemini provides is in consistent, tireless application of review principles to every line of every PR. When our guidelines are explicitly and carefully applied (as the subagents did), they produce equivalent results.

The best approach is to use both: Gemini for consistent coverage, human/AI reviewers for deeper architectural and design feedback that requires context beyond the immediate diff.

@gursewak1997 (Contributor) left a comment

lgtm

@cgwalters cgwalters merged commit d5a5a62 into bootc-dev:main Dec 19, 2025
3 checks passed