
Conversation

@mithro (Contributor) commented Dec 16, 2025

Summary

  • Refactored do_analyzing to extract container outputs independently, so that if one extraction fails, the others still run
  • Added _extract_container_outputs() helper function that wraps each file extraction (log file, runs archive, output GDS) in independent try/except blocks
  • Added 3 tests verifying extraction resilience and DRC error handling

Problem

Previously, if any file extraction raised an exception during the ANALYZING step (e.g., Docker API error, filesystem error), the entire analysis would fail and the check would go to ERROR state without:

  1. Attempting to extract remaining files
  2. Parsing logs to detect DRC errors
  3. Properly transitioning to FINISHED state

Solution

Each extraction is now independent:

  • If _save_runs_archive fails → still try _save_output_gds
  • If any extraction fails → still parse logs and detect DRC errors
  • Extraction errors are logged but don't prevent analysis completion

This ensures we collect as much data as possible even when some extractions fail, and DRC errors are still properly reported to users.
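As a rough sketch of the pattern (assuming the helper takes the container, a staging directory, and the check; _save_runs_archive and _save_output_gds are named above, while _save_log_file and all signatures are illustrative assumptions, not the actual implementation):

    import logging

    logger = logging.getLogger(__name__)

    # Hypothetical per-file extractors: _save_runs_archive and _save_output_gds
    # are named in this PR, _save_log_file is an assumed name.
    def _save_log_file(container, output_dir, check): ...
    def _save_runs_archive(container, output_dir, check): ...
    def _save_output_gds(container, output_dir, check): ...

    def _extract_container_outputs(container, output_dir, check) -> list[str]:
        """Run each extraction independently, collecting failure labels."""
        errors: list[str] = []
        steps = (
            ("log file", _save_log_file),
            ("runs archive", _save_runs_archive),
            ("output GDS", _save_output_gds),
        )
        for label, extract in steps:
            try:
                extract(container, output_dir, check)
            except Exception:
                # Deliberately broad: collect as much data as possible and
                # keep going; each failure is logged with a full traceback.
                logger.exception("Failed to extract %s", label)
                errors.append(label)
        return errors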

Test plan

  • test_continues_extraction_when_runs_archive_fails - verifies analysis completes when runs archive extraction fails (a sketch of this test follows this list)
  • test_continues_extraction_when_output_gds_fails - verifies analysis completes when GDS extraction fails
  • test_continues_analysis_on_drc_errors_with_extraction_failures - verifies DRC errors are detected even when all extractions fail
  • All 1219 tests pass
  • Lint and type checks pass
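A minimal sketch of the first of these tests, assuming a mocked Docker container and the path/exception details shown in the review comments below (the surrounding test class, the check fixture, and do_analyzing's exact call shape are assumptions):

    from unittest import mock

    import docker

    def test_continues_extraction_when_runs_archive_fails(self) -> None:
        """Analysis finishes even when the runs archive cannot be extracted."""
        not_found_msg = "no such path"

        def mock_get_archive(path: str) -> tuple:
            if path.endswith("/workspace/runs"):
                msg = "Simulated runs extraction failure"
                raise RuntimeError(msg)
            raise docker.errors.NotFound(not_found_msg)

        container = mock.MagicMock()
        container.get_archive.side_effect = mock_get_archive

        do_analyzing(self.check, container)  # call shape assumed

        self.check.refresh_from_db()
        assert self.check.status == ManufacturabilityCheck.Status.FINISHED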

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor

    • Improved robustness of manufacturability analysis error handling; the extraction process now continues when individual steps fail instead of stopping at the first error, providing more reliable analysis completion.
  • Tests

    • Added tests verifying that manufacturability analysis continues successfully when extraction steps fail independently.


…s fail

Extract container outputs (logs, runs archive, output GDS) independently
so that if one extraction fails, the others still run. This ensures we
collect as much data as possible even when Docker API errors or other
issues occur during extraction.

Previously, if _save_runs_archive raised an exception, _save_output_gds
would never run, and the whole analysis would fail. Now each extraction
is wrapped in its own try/except, and analysis continues regardless.

Adds 3 tests verifying:
- Analysis continues when runs archive extraction fails
- Analysis continues when output GDS extraction fails
- DRC errors are still detected even when all extractions fail

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings December 16, 2025 14:21
coderabbitai bot commented Dec 16, 2025

Walkthrough

The changes move the container output extraction logic into a dedicated helper function with independent error handling for each step, enabling analysis to proceed when individual extraction steps fail. Three tests validate that manufacturability analysis completes successfully despite extraction failures.

Changes

  • Extraction logic refactoring (wafer_space/projects/tasks_checks.py): introduces an _extract_container_outputs helper function that performs staged extraction (logs, runs archive, output GDS) with independent error handling and error collection, replacing the inline extraction logic in do_analyzing with a call to this function.
  • Resilience tests (wafer_space/projects/tests/test_tasks.py): adds three new test methods (test_continues_extraction_when_runs_archive_fails, test_continues_extraction_when_output_gds_fails, and test_continues_analysis_on_drc_errors_with_extraction_failures) verifying that analysis completes with FINISHED status when extraction steps fail.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

  • Verify the new _extract_container_outputs function correctly isolates errors for each extraction step (logs, runs archive, GDS) and continues on failure
  • Confirm the integration point in do_analyzing properly calls and handles the returned error list
  • Validate that all three new test scenarios accurately simulate extraction failures and assert expected outcomes (FINISHED status, manufacturability results)

Poem

🐰 A hop through refactored flows,
Where errors don't halt the prose,
Extract in stages, collect the pain,
Let analysis dance on again,
Resilient logic makes tests glow!

Pre-merge checks

✅ Passed checks (3 passed)

  • Description check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title directly and clearly summarizes the main change: refactoring file extraction to be resilient to individual failures rather than stopping on the first error.
  • Docstring coverage: ✅ Passed. Docstring coverage is 80.00%, which meets the required threshold of 80.00%.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
wafer_space/projects/tests/test_tasks.py (2)

2037-2042: Consider more precise path matching in mock function.

The string containment check if "/workspace/runs" in path: could theoretically match unintended paths (e.g., "/workspace/runs2" or paths containing that substring). While unlikely to cause issues in practice since Docker paths are specific, using an exact match or path.endswith() would be more robust.

         def mock_get_archive(path: str) -> tuple:
-            if "/workspace/runs" in path:
+            if path.endswith("/workspace/runs"):
                 msg = "Simulated runs extraction failure"
                 raise RuntimeError(msg)
             # Return empty for other paths
             raise docker.errors.NotFound(not_found_msg)

2086-2091: Consider more precise path matching in mock function.

Same as the previous test, the string containment check if "/output/design.gds" in path: could match unintended paths. Consider using path.endswith() for better precision.

         def mock_get_archive(path: str) -> tuple:
-            if "/output/design.gds" in path:
+            if path.endswith("/output/design.gds"):
                 msg = "Simulated GDS extraction failure"
                 raise RuntimeError(msg)
             # Return not found for other paths
             raise docker.errors.NotFound(not_found_msg)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0507df2 and 78e7148.

📒 Files selected for processing (2)
  • wafer_space/projects/tasks_checks.py (2 hunks)
  • wafer_space/projects/tests/test_tasks.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{py,pyi}

📄 CodeRabbit inference engine (CLAUDE.md)

Lint errors must be fixed, never suppressed. Never add # noqa, # type: ignore, or similar without explicit user permission. Always run make lint-fix && make lint && make type-check before committing.

Files:

  • wafer_space/projects/tests/test_tasks.py
  • wafer_space/projects/tasks_checks.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Never create circular imports. Maintain layer separation: Models never import tasks/views/services; Services can import models and tasks; Views can import models and services; Tasks can import models only. Never use local imports as a workaround for circular dependencies.
Avoid print statements. Use logging.getLogger(__name__) instead (Ruff rule T201).
Lines must not exceed 88 characters (Ruff rule E501). Break long lines into multiple lines.
Never use f-strings directly in exception messages (Ruff rule EM102). Assign the message to a variable first before using it in the exception.
Avoid boolean positional arguments (Ruff rule FBT002/3). Use keyword-only arguments instead: def fn(*, flag=True).
Never hardcode passwords directly in code (Ruff rule S105). Use constants like TEST_PASSWORD = '...' instead.
All public functions must have type hints. Use from __future__ import annotations for forward references. Fix mypy errors, don't ignore them.
Use specific exception handling with exception chaining. Catch specific exception types and use raise NewException(msg) from exc to chain exceptions. Never use bare except: or overly broad except Exception:.
Celery task names in CELERY_BEAT_SCHEDULE must reference the actual decorated function name (e.g., 'app.tasks.process_check_queue'), not a variable assignment. Task registration comes from the @shared_task decorator, not Python variable assignments.

Files:

  • wafer_space/projects/tests/test_tasks.py
  • wafer_space/projects/tasks_checks.py
**/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/tests/**/*.py: Use factory-boy for creating test data, not pytest fixtures. Write one assertion concept per test. Mirror test file structure to source: wafer_space/users/models.py → wafer_space/users/tests/test_models.py.
All browser tests run headless by default. Never use the --visible flag in automated tests (blocked with error). Use WebDriverWait, never time.sleep(). Screenshots are auto-captured on failure to tests/browser/screenshots/. Use Page Object pattern for browser tests.

Files:

  • wafer_space/projects/tests/test_tasks.py
🧠 Learnings (1)
📚 Learning: 2025-12-01T23:49:29.655Z
Learnt from: CR
Repo: wafer-space/platform.wafer.space PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-01T23:49:29.655Z
Learning: Applies to **/tests/**/*.py : Use factory-boy for creating test data, not pytest fixtures. Write one assertion concept per test. Mirror test file structure to source: `wafer_space/users/models.py` → `wafer_space/users/tests/test_models.py`.

Applied to files:

  • wafer_space/projects/tests/test_tasks.py
🧬 Code graph analysis (2)
wafer_space/projects/tests/test_tasks.py (2)
wafer_space/projects/models.py (4)
  • ManufacturabilityCheck (1438-2529)
  • Status (130-139)
  • Status (1149-1153)
  • Status (1441-1507)
wafer_space/projects/tasks_checks.py (1)
  • do_analyzing (1716-1821)
wafer_space/projects/tasks_checks.py (1)
wafer_space/projects/models.py (1)
  • ManufacturabilityCheck (1438-2529)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: pytest
🔇 Additional comments (5)
wafer_space/projects/tests/test_tasks.py (2)

2109-2154: LGTM! Important edge case covered.

This test validates critical behavior: DRC errors from log parsing are captured and reported even when all file extractions fail. This ensures users can still see what went wrong with their design despite infrastructure issues.


2010-2154: Excellent test coverage for resilient extraction.

The three new tests comprehensively verify that analysis completes successfully when individual extraction steps fail, covering:

  1. Runs archive extraction failure
  2. Output GDS extraction failure
  3. DRC errors with all extractions failing

The tests follow the factory-boy pattern as required by coding guidelines and maintain consistency with existing test structure.

wafer_space/projects/tasks_checks.py (3)

1577-1594: LGTM! Well-designed resilient extraction helper.

The function signature, type hints, and docstring clearly communicate the resilient behavior: each extraction is independent, failures are collected but don't stop execution. This achieves the PR's goal of ensuring analysis completes even when individual extractions fail.


1595-1634: Appropriate use of broad exception handling for resilience.

The broad except Exception: blocks are justified here since the goal is to collect as much data as possible even if individual steps fail. Each exception is properly logged with full traceback via logger.exception(), maintaining observability while ensuring resilience.

Per coding guidelines, specific exception handling is preferred, but this is a valid exception (pun intended) to that rule given the explicit design goal of fault tolerance during extraction.
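For contrast, a sketch of the style the guidelines normally require: a specific exception type, the message assigned to a variable first (EM102), and chaining with "from exc". ExtractionError and the helper signature here are hypothetical:

    import docker

    class ExtractionError(Exception):
        """Hypothetical domain error for a failed container extraction."""

    def _save_runs_archive(container, output_dir):  # signature assumed
        try:
            stream, _stat = container.get_archive("/workspace/runs")
        except docker.errors.NotFound as exc:
            msg = "Runs archive not found in container"
            raise ExtractionError(msg) from exc
        # ... write the tar stream into output_dir (omitted) ...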


1767-1771: LGTM! Proper cleanup guaranteed via finally block.

The refactoring successfully centralizes extraction logic while ensuring temporary directory cleanup happens regardless of extraction outcomes. Not capturing the extraction errors from _extract_container_outputs is intentional—they're logged for monitoring but don't affect the analysis result, which is determined by log parsing.
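As a sketch of that shape, reusing the helper sketched under Solution above (the tempfile-based staging and parameter names are assumptions; the real do_analyzing is not shown on this page):

    import shutil
    import tempfile

    def do_analyzing(check, container):  # parameters assumed
        tmp_dir = tempfile.mkdtemp()
        try:
            # The returned error list is intentionally unused: failures are
            # logged inside the helper, and the analysis result is determined
            # by log parsing, not by extraction success.
            _extract_container_outputs(container, tmp_dir, check)
        finally:
            # Cleanup runs regardless of extraction outcomes.
            shutil.rmtree(tmp_dir, ignore_errors=True)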

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
