-
Notifications
You must be signed in to change notification settings - Fork 0
fix: prevent starting container if one already exists for check #189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add does_check_container_exist() to verification.py that checks for any container (running or stopped) with the check's label. Called at the start of _start_container() to abort if a duplicate would be created. This prevents race conditions where check_process_job runs multiple times for the same check (e.g., Celery retry). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WalkthroughThese changes add a duplicate container detection mechanism by introducing a new function that checks for existing Docker containers (running or stopped) for a given check, and integrating it as a pre-check in the container startup process to prevent duplicate container creation. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds duplicate container detection to prevent race conditions where multiple Celery task retries could create multiple Docker containers for the same manufacturability check.
- Introduces
does_check_container_exist()function to query for any existing containers (running or stopped) with the check's label - Adds a safety check in
_start_container()to abort container creation if a duplicate would be created - Leverages existing exception handling by raising
ValueErrorwhen duplicates are detected
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| wafer_space/projects/verification.py | Adds new does_check_container_exist() function to check for any existing containers (running or stopped) with the check's label, preventing duplicate container creation |
| wafer_space/projects/tasks.py | Imports and calls the new duplicate check function at the start of _start_container(), raising ValueError if a container already exists for the check |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if does_check_container_exist(context.check) is True: | ||
| msg = f"Container already exists for check {context.check.id}" |
Copilot
AI
Dec 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for is True allows the function to proceed when Docker is unavailable (None return). This means duplicate containers could still be created if Docker connection temporarily fails during the check. Consider either:
- Treating
Noneas a failure case:if does_check_container_exist(context.check) is not False: - Or raising an error when Docker is unavailable to prevent any container creation without verification
This ensures containers are only created when we can definitively verify no duplicates exist.
| if does_check_container_exist(context.check) is True: | |
| msg = f"Container already exists for check {context.check.id}" | |
| exists = does_check_container_exist(context.check) | |
| if exists is not False: | |
| msg = ( | |
| f"Cannot verify container does not already exist for check {context.check.id} (result: {exists})" | |
| ) |
| # Safety check: prevent duplicate containers | ||
| if does_check_container_exist(context.check) is True: | ||
| msg = f"Container already exists for check {context.check.id}" | ||
| raise ValueError(msg) |
Copilot
AI
Dec 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this check prevents some duplicates, a race condition still exists between checking for containers (line 543) and creating one (later in the function). Two simultaneous calls could both pass the check and create duplicate containers. Consider using Docker labels with a unique constraint or implementing distributed locking (e.g., Redis lock with the check_id as the key) to ensure atomicity between check and create operations.
| def does_check_container_exist(check: ManufacturabilityCheck) -> bool | None: | ||
| """Check if any Docker container exists for this check (running or stopped). | ||
| This prevents creating duplicate containers for the same check. | ||
| Unlike is_check_container_running(), this checks ALL containers. | ||
| Args: | ||
| check: ManufacturabilityCheck to verify | ||
| Returns: | ||
| True if any container found (running or stopped) | ||
| False if no container found | ||
| None if Docker unavailable | ||
| """ | ||
| try: | ||
| client = docker.from_env() | ||
| except docker.errors.DockerException: | ||
| logger.warning( | ||
| "Failed to connect to Docker - cannot check container for %s", | ||
| check.id, | ||
| ) | ||
| return None | ||
|
|
||
| try: | ||
| containers = client.containers.list( | ||
| all=True, # Include stopped containers | ||
| filters={"label": f"wafer.space.check_id={check.id}"}, | ||
| ) | ||
| except docker.errors.DockerException: | ||
| logger.exception("Error querying Docker for check %s", check.id) | ||
| return None | ||
| finally: | ||
| if hasattr(client, "close"): | ||
| client.close() | ||
|
|
||
| if containers: | ||
| container_info = [(c.short_id, c.status) for c in containers] | ||
| logger.warning( | ||
| "Check %s already has container(s): %s", | ||
| check.id, | ||
| container_info, | ||
| ) | ||
| return True | ||
|
|
||
| return False |
Copilot
AI
Dec 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new does_check_container_exist() function follows the same pattern as is_check_container_running() but lacks test coverage. Since comprehensive tests exist for the similar function in test_verification.py (lines 670-796), tests should be added to cover:
- Container found (returns True)
- No container found (returns False)
- Docker connection failure (returns None)
- Docker query error (returns None)
- Multiple containers found (returns True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
wafer_space/projects/tasks.py (1)
528-546: Duplicate-container guard is a good safety net; note remaining race semanticsThe pre-check using
does_check_container_exist(context.check)and documenting theValueErrorin the docstring nicely prevents starting a new container when Docker already reports one (running or stopped) for this check, while preserving behavior when Docker is unavailable (onlyTrueshort-circuits).Two behavioral nuances to be aware of:
- There is still a narrow race where two workers can both enter
_start_containerconcurrently, each seeing no existing containers and both callingcontainers.run(). Preventing that scenario entirely would require some DB-side coordination (e.g., row-level locking or an invariant on “at most one active RUNNING/DISPATCHED check per id”) rather than relying solely on Docker state.- Raising
ValueErrorhere means a duplicate attempt will currently mark the checkERRORvia the genericValueErrorhandler, even though another worker may be legitimately processing it. If you prefer duplicates to be treated as “already running elsewhere” rather than an error, you may want a more specific exception or a dedicated early-return path.If these semantics are acceptable for now, the added guard is still a meaningful improvement; otherwise, consider tightening with a small state-machine/locking adjustment.
Also applies to: 915-916
wafer_space/projects/verification.py (1)
83-127: Newdoes_check_container_existhelper matches intended semantics; minor DRY opportunityThe tri-state contract (True/False/None), Docker error handling, and client cleanup all align with the docstring and with the patterns already used in
is_check_container_running, making this safe to call from the task layer.If you find yourself adding more container-verification helpers, consider extracting a small internal helper to encapsulate the shared “create client → list containers by label → close client” pattern used here and in
is_check_container_running, to keep the Docker access logic in one place. Not urgent, but would reduce duplication.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
wafer_space/projects/tasks.py(3 hunks)wafer_space/projects/verification.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{py,pyi}
📄 CodeRabbit inference engine (CLAUDE.md)
Lint errors must be fixed, never suppressed. Never add
# noqa,# type: ignore, or similar without explicit user permission. Always runmake lint-fix && make lint && make type-checkbefore committing.
Files:
wafer_space/projects/tasks.pywafer_space/projects/verification.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Never create circular imports. Maintain layer separation: Models never import tasks/views/services; Services can import models and tasks; Views can import models and services; Tasks can import models only. Never use local imports as a workaround for circular dependencies.
Avoid print statements. Uselogging.getLogger(__name__)instead (Ruff rule T201).
Lines must not exceed 88 characters (Ruff rule E501). Break long lines into multiple lines.
Never use f-strings directly in exception messages (Ruff rule EM102). Assign the message to a variable first before using it in the exception.
Avoid boolean positional arguments (Ruff rule FBT002/3). Use keyword-only arguments instead:def fn(*, flag=True).
Never hardcode passwords directly in code (Ruff rule S105). Use constants likeTEST_PASSWORD = '...'instead.
All public functions must have type hints. Usefrom __future__ import annotationsfor forward references. Fix mypy errors, don't ignore them.
Use specific exception handling with exception chaining. Catch specific exception types and useraise NewException(msg) from excto chain exceptions. Never use bareexcept:or overly broadexcept Exception:.
Celery task names inCELERY_BEAT_SCHEDULEmust reference the actual decorated function name (e.g.,'app.tasks.process_check_queue'), not a variable assignment. Task registration comes from the @shared_task decorator, not Python variable assignments.
Files:
wafer_space/projects/tasks.pywafer_space/projects/verification.py
🧬 Code graph analysis (2)
wafer_space/projects/tasks.py (1)
wafer_space/projects/verification.py (1)
does_check_container_exist(83-127)
wafer_space/projects/verification.py (1)
wafer_space/projects/models.py (1)
ManufacturabilityCheck(1114-1822)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Upload results
- GitHub Check: pytest
🔇 Additional comments (1)
wafer_space/projects/tasks.py (1)
57-61: Import of duplicate-container helper is appropriateBringing
does_check_container_existin alongside the other verification helpers keeps related logic centralized and does not introduce a runtime circular import (verification only imports models underTYPE_CHECKING). No issues here.
Summary
does_check_container_exist()function to check for any container (running or stopped) with the check's label_start_container()to abort if a duplicate would be createdValueErrorwhich is caught by existing exception handlingProblem
Race conditions can occur where
check_process_jobruns multiple times for the same check (e.g., Celery retry), potentially creating duplicate Docker containers.Solution
Before creating a new container, query Docker for any existing container with the
wafer.space.check_idlabel matching this check. If found, raise an error to abort the duplicate.Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.