[codex] Gate scheduled runs on data quality #25
Review Summary by Qodo
Gate scheduled runs on data quality validation
Walkthroughs
Description
• Gate scheduled runs on data quality validation by adding the --require-data-quality-pass flag to both the checked-in artifact workflow and the scheduled update workflow
• Add algorithm_misses to the generated data-quality monitor checks to detect cache/parser health regressions
• Enforce strict validation requirements, including current schema compliance and data-quality pass status
• Update validation documentation in README to reflect the stricter data-quality gate
• Refresh all generated API certificate artifacts with updated generation timestamps from the latest scraper run
Diagram
flowchart LR
A["Data Quality Monitor"] -->|includes algorithm_misses| B["Enhanced Quality Checks"]
B -->|validates| C["validate_api.py"]
C -->|--require-data-quality-pass| D["Scheduled Workflow"]
C -->|--require-data-quality-pass| E["Checked-in Artifacts Workflow"]
D -->|pass required| F["Weekly Update Succeeds"]
E -->|pass required| G["Artifacts Published"]
File Changes
1. api/certificates/1017.json
Code Review by Qodo
1. Timestamp-only artifact churn
What changed
• Add algorithm_misses to the generated data-quality monitor checks.
• Add --require-data-quality-pass to validate_api.py, failing validation when the data-quality monitor or any check is not pass.
Why
The data-quality endpoint previously reported run health, but the scheduled workflow would still pass if the monitor warned. This change makes the weekly Sunday update fail on cache/parser health regressions, including algorithm extraction misses.
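The gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in validate_api.py: the function name data_quality_failures, the monitor layout (a top-level "status" plus a "checks" mapping), and the parser_errors check are assumptions made for the example. Only algorithm_misses and the pass/warn distinction come from this PR.

```python
from typing import Any


def data_quality_failures(monitor: dict[str, Any]) -> list[str]:
    """Return one reason per part of the monitor that is not 'pass'.

    Hypothetical layout: {"status": ..., "checks": {name: {"status": ...}}}.
    """
    failures = []
    # The overall monitor status must be exactly "pass"; "warn" counts as a failure.
    if monitor.get("status") != "pass":
        failures.append(f"monitor status is {monitor.get('status')!r}")
    # Every individual check must also be "pass".
    for name, check in monitor.get("checks", {}).items():
        if check.get("status") != "pass":
            failures.append(f"check {name!r} is {check.get('status')!r}")
    return failures


# Hypothetical sample payload: one check warns, so the monitor warns too.
sample_monitor = {
    "status": "warn",
    "checks": {
        "parser_errors": {"status": "pass"},  # illustrative check name
        "algorithm_misses": {"status": "warn"},
    },
}

sample_failures = data_quality_failures(sample_monitor)
for reason in sample_failures:
    print(reason)
```

Under these assumptions, a validator run with the strict flag would exit non-zero whenever the returned list is non-empty, which is what turns a warning into a failed scheduled run.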
Validation
• .venv-codex/bin/python -m py_compile scraper.py validate_api.py test_scraper.py
• .venv-codex/bin/python test_scraper.py
• .venv-codex/bin/python validate_api.py --require-current-schema --forbid-firecrawl-run-source --require-data-quality-pass
• git diff --check