Add git reset validation script and fix missing resets by juanmichelini · Pull Request #425 · OpenHands/benchmarks

juanmichelini · 2026-02-17T17:46:56Z

Summary

This PR adds a validation script that ensures git reset is called after every git clone or git checkout operation in the benchmark codebase. This prevents agents from looking at commits that are not part of the benchmark during evaluation.

Changes

New Files

benchmarks/scripts/validate_git_reset.py - Validation script that:
- Scans Python and shell files for git clone and git checkout commands
- Verifies that git reset follows within a configurable context window
- Accepts git reset in comments as valid (to allow explicit documentation)
- Skips git checkout -b as it creates new branches and doesn't need reset
tests/test_validate_git_reset.py - Test suite with 14 tests covering all functionality
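
The checks listed above can be sketched roughly as follows. This is a minimal illustration, not the code in validate_git_reset.py: the function name `find_missing_resets`, the regular expressions, and the default window of 5 lines are all assumptions, and the window here looks both before and after the command so that a preceding comment mentioning git reset also counts.

```python
import re

# Assumed patterns; the real script may match commands differently.
TRIGGER = re.compile(r"\bgit\s+(clone|checkout)\b")
NEW_BRANCH = re.compile(r"\bgit\s+checkout\s+-b\b")
RESET = re.compile(r"\bgit\s+reset\b")


def find_missing_resets(lines, context=5):
    """Return 1-based line numbers of clone/checkout commands with no nearby git reset."""
    missing = []
    for i, line in enumerate(lines):
        # Only git clone / git checkout are of interest; checkout -b is skipped.
        if not TRIGGER.search(line) or NEW_BRANCH.search(line):
            continue
        # Any mention of "git reset" in the window counts, including in a comment.
        window = lines[max(0, i - context): i + context + 1]
        if not any(RESET.search(w) for w in window):
            missing.append(i + 1)
    return missing
```

Calling `find_missing_resets(["git clone repo", "cd repo"])` reports line 1, while adding a `git reset --hard HEAD` line or a comment mentioning git reset within the window clears it.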

Modified Files

Fixed 6 files with missing git reset operations:

benchmarks/commit0/run_infer.py - Added git reset --hard HEAD after clone
legacy/ml_bench/run_infer.py - Added git reset --hard HEAD after clone
legacy/lca_ci_build_repair/run_infer.py - Added git reset --hard HEAD after switch
legacy/lca_ci_build_repair/eval_infer.py - Added git reset --hard HEAD after switch
legacy/swefficiency/scripts/setup/prepare_swe_utils.sh - Added git reset --hard HEAD
legacy/testgeneval/scripts/setup/prepare_swe_utils.sh - Added git reset --hard HEAD
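
As an illustration of the pattern these fixes apply, a clone followed immediately by a hard reset might look like the sketch below. The helper name `clone_and_reset` and its injectable `run` parameter are hypothetical and exist only to make the example testable; the actual changes in the listed files may be structured differently.

```python
import subprocess


def clone_and_reset(repo_url, dest, run=subprocess.run):
    """Clone repo_url into dest, then run git reset --hard HEAD inside it."""
    run(["git", "clone", repo_url, dest], check=True)
    # The reset is issued explicitly right after the clone,
    # which is the pairing the validation script looks for.
    run(["git", "reset", "--hard", "HEAD"], cwd=dest, check=True)
```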

Added comments documenting why git reset is not needed:

legacy/utils/version_control.sh - This file handles version control for the development environment, not benchmark evaluation

Configuration

Added validate-git-reset to .pre-commit-config.yaml for automatic validation
Added validate-git-reset CLI entry point in pyproject.toml

Usage

Run validation manually:

validate-git-reset .
# or
python benchmarks/scripts/validate_git_reset.py .
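
For orientation, a command of this shape could walk the given directory and return a non-zero exit code when something is flagged, which is how a pre-commit hook signals failure. The sketch below is an assumption-laden simplification: it uses a crude whole-file check rather than a context window, only considers .py and .sh files, and the names `file_has_unreset_command` and `main` are hypothetical rather than taken from the PR.

```python
import re
import sys
from pathlib import Path

TRIGGER = re.compile(r"\bgit\s+(clone|checkout)\b")
RESET = re.compile(r"\bgit\s+reset\b")


def file_has_unreset_command(path):
    """Crude whole-file check: a clone/checkout appears but 'git reset' never does."""
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    return bool(TRIGGER.search(text)) and not RESET.search(text)


def main(argv=None):
    """Scan the directory given as the first argument; return 1 if any file is flagged."""
    args = sys.argv[1:] if argv is None else argv
    root = Path(args[0]) if args else Path(".")
    offenders = [
        p for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix in {".py", ".sh"} and file_has_unreset_command(p)
    ]
    for p in offenders:
        print(f"{p}: git clone/checkout without git reset")
    return 1 if offenders else 0
```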

If a git clone or git checkout is intentionally not followed by git reset, add a comment:

# git reset is not needed here because this is a development utility
git checkout main

Testing

All 14 tests pass:

uv run pytest tests/test_validate_git_reset.py -v

Fixes #424


This PR adds a validation script that ensures git reset is called after every git clone or git checkout operation in the benchmark codebase. This prevents agents from looking at commits that are not part of the benchmark during evaluation.

Changes:
- Add validate_git_reset.py script to check for git reset after clone/checkout
- Add pre-commit hook for automatic validation
- Fix 6 files with missing git reset operations:
  - benchmarks/commit0/run_infer.py
  - legacy/ml_bench/run_infer.py
  - legacy/lca_ci_build_repair/run_infer.py
  - legacy/lca_ci_build_repair/eval_infer.py
  - legacy/swefficiency/scripts/setup/prepare_swe_utils.sh
  - legacy/testgeneval/scripts/setup/prepare_swe_utils.sh
- Add comments for version_control.sh where git reset is not needed
- Add test suite for the validation script (14 tests)
- Add CLI entry point: validate-git-reset

The validation accepts 'git reset' in comments as valid, allowing explicit documentation when reset is intentionally skipped.

Fixes #424

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai bot mentioned this pull request Feb 17, 2026

make sure all git clone have git reset #424

Open

juanmichelini wants to merge 1 commit into main from openhands/validate-git-reset-424
