Add git reset validation script and fix missing resets #425

Draft
juanmichelini wants to merge 1 commit into main from openhands/validate-git-reset-424

Conversation

@juanmichelini
Collaborator

Summary

This PR adds a validation script that ensures git reset is called after every git clone or git checkout operation in the benchmark codebase. This prevents agents from looking at commits that are not part of the benchmark during evaluation.

Changes

New Files

  • benchmarks/scripts/validate_git_reset.py - Validation script that:
    • Scans Python and shell files for git clone and git checkout commands
    • Verifies that git reset follows within a configurable context window
    • Treats a comment that mentions git reset as satisfying the check, so an intentional omission can be documented explicitly
    • Skips git checkout -b as it creates new branches and doesn't need reset
  • tests/test_validate_git_reset.py - Test suite with 14 tests covering all functionality
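
The checks listed above can be sketched roughly as follows. This is an illustration only, not the code in validate_git_reset.py: the function name find_missing_resets, the regular expressions, the default window of 5 lines, and the choice to look at lines on both sides of the command (so that a comment placed above it also counts) are all assumptions.

```python
import re

# Hypothetical patterns; the real script's matching rules may differ.
GIT_OP = re.compile(r"\bgit\s+(clone|checkout)\b")
NEW_BRANCH = re.compile(r"\bgit\s+checkout\s+-b\b")
GIT_RESET = re.compile(r"\bgit\s+reset\b")


def find_missing_resets(text: str, window: int = 5) -> list[int]:
    """Return 1-based line numbers of clone/checkout commands with no nearby 'git reset'."""
    lines = text.splitlines()
    missing = []
    for i, line in enumerate(lines):
        # Only clone/checkout lines are checked; 'git checkout -b' is skipped.
        if not GIT_OP.search(line) or NEW_BRANCH.search(line):
            continue
        # Assumed context: the command line itself plus `window` lines on each side.
        start = max(0, i - window)
        context = lines[start : i + window + 1]
        # Any mention of 'git reset' counts, including one inside a comment.
        if not any(GIT_RESET.search(c) for c in context):
            missing.append(i + 1)
    return missing
```

In this sketch a shell snippet containing only a clone is reported at its line number, while the same snippet followed by git reset --hard HEAD, or preceded by a comment mentioning git reset, produces an empty list.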

Modified Files

Fixed 6 files with missing git reset operations:

  • benchmarks/commit0/run_infer.py - Added git reset --hard HEAD after clone
  • legacy/ml_bench/run_infer.py - Added git reset --hard HEAD after clone
  • legacy/lca_ci_build_repair/run_infer.py - Added git reset --hard HEAD after switch
  • legacy/lca_ci_build_repair/eval_infer.py - Added git reset --hard HEAD after switch
  • legacy/swefficiency/scripts/setup/prepare_swe_utils.sh - Added git reset --hard HEAD
  • legacy/testgeneval/scripts/setup/prepare_swe_utils.sh - Added git reset --hard HEAD
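
For illustration, the clone-then-reset pattern applied in the Python files might look like the sketch below. It is not taken from the PR: the helper name clone_and_reset, its parameters, and the injectable run argument are assumptions, and the actual files may invoke git differently (for example through shell command strings).

```python
import subprocess
from pathlib import Path


def clone_and_reset(repo_url: str, dest: Path, run=subprocess.run) -> None:
    """Clone repo_url into dest, then hard-reset the working tree to HEAD."""
    run(["git", "clone", repo_url, str(dest)], check=True)
    # Reset the index and working tree to HEAD right after the clone,
    # mirroring the 'git reset --hard HEAD' lines added in this PR.
    run(["git", "reset", "--hard", "HEAD"], cwd=str(dest), check=True)
```

The run parameter defaults to subprocess.run and exists only so the helper can be exercised with a stub in tests without touching the network.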

Added comments documenting why git reset is not needed:

  • legacy/utils/version_control.sh - This file handles version control for the development environment, not benchmark evaluation

Configuration

  • Added validate-git-reset to .pre-commit-config.yaml for automatic validation
  • Added validate-git-reset CLI entry point in pyproject.toml

Usage

Run validation manually:

validate-git-reset .
# or
python benchmarks/scripts/validate_git_reset.py .

If a git clone or git checkout is intentionally not followed by git reset, add a nearby comment that mentions git reset and explains why it is skipped:

# git reset is not needed here because this is a development utility
git checkout main

Testing

All 14 tests pass:

uv run pytest tests/test_validate_git_reset.py -v

Fixes #424

This PR adds a validation script that ensures git reset is called after
every git clone or git checkout operation in the benchmark codebase.
This prevents agents from looking at commits that are not part of the
benchmark during evaluation.

Changes:
- Add validate_git_reset.py script to check for git reset after clone/checkout
- Add pre-commit hook for automatic validation
- Fix 6 files with missing git reset operations:
  - benchmarks/commit0/run_infer.py
  - legacy/ml_bench/run_infer.py
  - legacy/lca_ci_build_repair/run_infer.py
  - legacy/lca_ci_build_repair/eval_infer.py
  - legacy/swefficiency/scripts/setup/prepare_swe_utils.sh
  - legacy/testgeneval/scripts/setup/prepare_swe_utils.sh
- Add comments for version_control.sh where git reset is not needed
- Add test suite for the validation script (14 tests)
- Add CLI entry point: validate-git-reset

The validation accepts 'git reset' in comments as valid, allowing
explicit documentation when reset is intentionally skipped.

Fixes #424

Co-authored-by: openhands <openhands@all-hands.dev>
Development

Successfully merging this pull request may close these issues.

make sure all git clone have git reset
