Skip to content

fix: benchmark script regex + workflow branch naming#124

Merged
carlos-alm merged 3 commits intomainfrom
fix/benchmark-regex-h3
Feb 26, 2026
Merged

fix: benchmark script regex + workflow branch naming#124
carlos-alm merged 3 commits intomainfrom
fix/benchmark-regex-h3

Conversation

@carlos-alm
Copy link
Contributor

Summary

Fixes two issues flagged by Greptile on #122:

  • Regex bug in scripts/update-benchmark-report.js: The performance section regex matched from ## Performance all the way to the next ## heading, swallowing ### Lightweight Footprint subsection. Changed lookahead from (?=\r?\n## (?!#)|$) to (?=\r?\n#{2,3} |$) so it stops at h3 headings too. Restored the deleted subsection in README.
  • Workflow branch naming: All four benchmark jobs created branches with benchmark/ prefix which doesn't match the required convention. Changed to docs/benchmark-*.

Supersedes #122.

Test plan

  • Verified new regex matches only the table content, not the ### Lightweight Footprint subsection
  • Verified replacement preserves both the subsection and the next ## heading

The performance section regex in update-benchmark-report.js was matching
from ## Performance all the way to the next ## heading, swallowing any
### subsections (like Lightweight Footprint) in between. Changed the
lookahead to also stop at h3 headings and restored the deleted section.
The benchmark workflow created branches with benchmark/ prefix which
doesn't match the required branch naming convention (feat/, fix/, docs/,
etc.). Changed all four jobs to use docs/benchmark-* instead.
@claude
Copy link

claude bot commented Feb 26, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Benchmark Script Regex + Workflow Branch Naming

Todo List:

  • Read changed files to understand modifications
  • Analyze regex fix in update-benchmark-report.js
  • Review workflow branch naming changes
  • Apply 4-phase review framework
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Two distinct bugs flagged by Greptile on #122:

  1. Regex overconsumption: The old pattern (?=\r?\n## (?!#)|$) greedily matched from ## 📊 Performance to the next ## heading, swallowing the ### Lightweight Footprint h3 subsection
  2. Branch naming mismatch: All four workflow jobs used benchmark/ prefix instead of required docs/ convention

Why This Approach: Both fixes are precise and surgical:

  • Regex fix changes lookahead from (?=\r?\n## (?!#)|$) to (?=\r?\n#{2,3} |$) to stop at both h2 (##) and h3 (###) headers
  • Branch naming systematically changes benchmark/docs/benchmark- across all four jobs

Risk Assessment: LOW RISK - Both are targeted fixes addressing specific reported bugs.

Critical Concerns

None identified. This is a clean bug fix addressing legitimate issues.

Technical Analysis

Regex Fix ✅ EXCELLENT

Before: /## 📊 Performance\r?\n[\s\S]*?(?=\r?\n## (?!#)|$)/

  • Problem: (?!#) negative lookahead failed - ## (?!#) matches ### Lightweight because after ## there's a space, not #
  • Result: Consumed everything until next ## heading, deleting the h3 subsection

After: /## 📊 Performance\r?\n[\s\S]*?(?=\r?\n#{2,3} |$)/

  • Solution: #{2,3} explicitly matches both ## and ### patterns
  • Result: Stops at h3 headings, preserving subsections

The fix is technically correct and precisely targets the root cause.

Workflow Changes ✅ CORRECT

All four jobs consistently changed benchmark/docs/benchmark-:

  • build-benchmark: docs/benchmark-build-$(date)
  • embedding-benchmark: docs/benchmark-embedding-$(date)
  • query-benchmark: docs/benchmark-query-$(date)
  • incremental-benchmark: docs/benchmark-incremental-$(date)

Branch naming now aligns with repository conventions.

File Updates ✅ APPROPRIATE

  • scripts/update-benchmark-report.js: Core regex fix + updated comments
  • .github/workflows/benchmark.yml: Systematic branch naming fix
  • README.md: Performance metrics updated (normal benchmark refresh)
  • generated/BUILD-BENCHMARKS.md: Benchmark data updated (normal refresh)

The README and BUILD-BENCHMARKS changes appear to be from normal benchmark runs, not manual edits.

Code Quality Assessment

Strengths

  1. Precise targeting: Both fixes address exact root causes without side effects
  2. Clear documentation: Updated regex comments explain the h2/h3 behavior
  3. Systematic consistency: All four workflow jobs updated uniformly
  4. Test validation: PR author verified regex matches only table content
  5. No regression risk: Changes are isolated and well-scoped

Verification

  • Regex pattern #{2,3} correctly matches both ## and ###
  • Branch naming follows consistent docs/benchmark-* pattern
  • No configuration rules relaxed or tests weakened
  • No unrelated changes mixed in

Final Recommendation

## Root Cause Analysis
**Problem Being Solved**: Regex bug consuming h3 subsections + incorrect workflow branch naming
**Why This Approach**: Surgical fixes targeting exact root causes with minimal change surface
**Risk Assessment**: LOW - Isolated, well-tested changes with clear intent

## Critical Concerns
None. This is a clean, focused bug fix.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: **APPROVE** 
- Reasoning: Excellent technical solution addressing real bugs. Precise regex fix with clear improvement, systematic workflow updates, and thorough validation by author.

This PR demonstrates best practices: surgical changes, clear problem identification, and systematic consistency across related files.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 26, 2026

Greptile Summary

This PR fixes two issues in the benchmark automation: a regex bug that was deleting the "Lightweight Footprint" subsection from README, and incorrect branch naming that didn't follow the repository's docs/ convention.

  • Regex fix (scripts/update-benchmark-report.js:220): Changed lookahead from (?=\r?\n## (?!#)|$) to (?=\r?\n#{2,3} |$) so the replacement stops at both h2 (##) and h3 (###) headers, preventing the script from swallowing h3 subsections
  • Branch naming (.github/workflows/benchmark.yml): All four benchmark jobs now use docs/benchmark-* instead of benchmark/*, consistent with repository conventions seen in other doc branches
  • Benchmark data regenerated: README.md and BUILD-BENCHMARKS.md were updated with v2.4.0 metrics (109 files, up from 99) with the Lightweight Footprint section correctly preserved

Both fixes address root causes flagged in #122. The regex fix prevents the bug that was manually patched in commit 744f95a from recurring.

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • Both fixes are surgical and correct: the regex change is a precise lookahead pattern fix with clear before/after behavior, and the branch naming is a straightforward string replacement aligning with documented conventions. The test plan confirms proper preservation of the h3 subsection, and regenerated benchmark data is consistent across all files.
  • No files require special attention

Important Files Changed

Filename Overview
scripts/update-benchmark-report.js Fixed regex to stop at h3 headers, preventing subsection deletion
.github/workflows/benchmark.yml Changed branch naming from benchmark/ to docs/benchmark-* to match conventions

Last reviewed commit: 7f102f6

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@carlos-alm carlos-alm merged commit 53fc34f into main Feb 26, 2026
16 of 17 checks passed
@carlos-alm carlos-alm deleted the fix/benchmark-regex-h3 branch February 26, 2026 08:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant