Add per-eval Docker image override via evals.image property #2153
Merged
dgageot merged 1 commit into docker:main on Mar 18, 2026
Allow each eval JSON to specify a custom Docker image through the "image" field in the "evals" object, overriding the global --base-image flag. The image build cache key now includes both workingDir and image to correctly handle different images for the same working directory.

Assisted-By: docker-agent
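As a rough illustration of the description above, here is a minimal sketch of an eval JSON carrying an "image" field inside its "evals" object, and of how such a file could be decoded. The type names `evalFile` and `evalCriteria`, the `working_dir` JSON key, and the image tag are assumptions made for the example; the PR text only confirms the "evals" object and its "image" field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// evalCriteria is a hypothetical stand-in for the project's
// session.EvalCriteria. Only the "image" field is described in the PR;
// the "working_dir" key is an assumed name used for illustration.
type evalCriteria struct {
	WorkingDir string `json:"working_dir,omitempty"`
	Image      string `json:"image,omitempty"`
}

// evalFile models just the part of an eval JSON that matters here.
type evalFile struct {
	Evals *evalCriteria `json:"evals,omitempty"`
}

// parseEval decodes an eval JSON document. Evals stays nil when the
// "evals" object is absent, which callers must check before use.
func parseEval(data []byte) (evalFile, error) {
	var f evalFile
	err := json.Unmarshal(data, &f)
	return f, err
}

func main() {
	raw := []byte(`{"evals": {"working_dir": "proj", "image": "python:3.12-slim"}}`)

	f, err := parseEval(raw)
	if err != nil {
		panic(err)
	}
	if f.Evals != nil && f.Evals.Image != "" {
		fmt.Println("per-eval image:", f.Evals.Image)
	} else {
		fmt.Println("no per-eval image; the global --base-image would apply")
	}
}
```

An eval file that omits "image" (or omits "evals" entirely) decodes without error, so existing evals would keep using the global flag.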
gtardif approved these changes on Mar 18, 2026
Assessment: 🟢 APPROVE
This PR adds per-eval Docker image override functionality through the evals.image property. The implementation is sound:
Key Changes:
- Introduced imageKey struct combining workingDir and image for cache keys
- Modified getOrBuildImage to accept *session.EvalCriteria instead of just workingDir
- Added resolveBaseImage to prioritize per-eval image over global --base-image flag
- Updated preBuildImages to handle the new cache key structure
- Simplified Dockerfile template (removed docker-in-docker setup)
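The first and third changes in the list above can be sketched as follows. This is not the PR's code: the field names, the `keyFor` helper, and the signature of `resolveBaseImage` are assumptions chosen to match the review's description (a composite cache key, and a per-eval image that takes priority over the global one).

```go
package main

import "fmt"

// evalCriteria is a hypothetical stand-in for session.EvalCriteria.
type evalCriteria struct {
	WorkingDir string
	Image      string
}

// imageKey combines both inputs that affect the built image. A struct of
// comparable fields is itself comparable, so it works as a map key.
type imageKey struct {
	workingDir string
	image      string
}

// keyFor (hypothetical helper) returns the zero-valued key when no
// criteria are set, so all such evals share one cache entry.
func keyFor(c *evalCriteria) imageKey {
	if c == nil {
		return imageKey{}
	}
	return imageKey{workingDir: c.WorkingDir, image: c.Image}
}

// resolveBaseImage prefers the per-eval image and falls back to the
// global value (what --base-image would supply). Signature is assumed.
func resolveBaseImage(c *evalCriteria, globalBaseImage string) string {
	if c != nil && c.Image != "" {
		return c.Image
	}
	return globalBaseImage
}

func main() {
	global := "alpine:3.20" // assumed value of the global flag

	py := &evalCriteria{WorkingDir: "proj", Image: "python:3.12-slim"}
	node := &evalCriteria{WorkingDir: "proj", Image: "node:22"}

	fmt.Println(resolveBaseImage(nil, global))
	fmt.Println(resolveBaseImage(py, global))

	// Same working directory, different images: the keys must differ,
	// otherwise one eval would reuse the other's cached build.
	fmt.Println(keyFor(py) == keyFor(node))
}
```

Keying on workingDir alone would make the two evals in `main` collide, which is the case the new key structure is meant to avoid.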
Verification Results:
All potential issues were investigated and dismissed:
- Zero-valued imageKey when eval.Evals is nil is intentional and correct (all evals without custom config should share the same base image)
- Singleflight correctly deduplicates concurrent builds and doesn't cache errors indefinitely
- All nil pointer access paths are guarded by callers
The cache key design correctly handles the case where multiple evals need the same image configuration, and the singleflight integration prevents redundant concurrent builds.
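To make the deduplication and error-handling points concrete, here is a simplified, standard-library-only stand-in for the behavior the review describes. The PR itself is said to use singleflight; this sketch does not use that package, and `imageCache`, `getOrBuild`, and `errBuildFailed` are hypothetical names. It shows concurrent callers for one key waiting on a single in-flight build, successful results being cached, and failures being dropped so a later call can retry.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type imageKey struct{ workingDir, image string }

// errBuildFailed is a placeholder error used only by this example.
var errBuildFailed = errors.New("build failed")

// call tracks one in-flight build that other callers can wait on.
type call struct {
	done chan struct{}
	id   string
	err  error
}

type imageCache struct {
	mu       sync.Mutex
	built    map[imageKey]string // successful builds only
	inflight map[imageKey]*call
}

func newImageCache() *imageCache {
	return &imageCache{built: map[imageKey]string{}, inflight: map[imageKey]*call{}}
}

// getOrBuild runs build at most once at a time per key. Errors are
// returned to current waiters but never stored, so the next call retries.
func (c *imageCache) getOrBuild(key imageKey, build func() (string, error)) (string, error) {
	c.mu.Lock()
	if id, ok := c.built[key]; ok {
		c.mu.Unlock()
		return id, nil
	}
	if cl, ok := c.inflight[key]; ok {
		c.mu.Unlock()
		<-cl.done // wait for the build already in progress
		return cl.id, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl
	c.mu.Unlock()

	cl.id, cl.err = build()

	c.mu.Lock()
	delete(c.inflight, key)
	if cl.err == nil {
		c.built[key] = cl.id
	}
	c.mu.Unlock()
	close(cl.done)
	return cl.id, cl.err
}

func main() {
	cache := newImageCache()
	key := imageKey{workingDir: "proj", image: "python:3.12-slim"}

	_, err := cache.getOrBuild(key, func() (string, error) { return "", errBuildFailed })
	fmt.Println("first attempt error:", err)

	id, err := cache.getOrBuild(key, func() (string, error) { return "image-id-1", nil })
	fmt.Println("retry:", id, err)
}
```

A build function that panics would leave waiters blocked in this sketch; a production version would need to handle that case.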
No issues found in the changed code.