fix: repair all 6 knowledge import endpoints (0/6 → 6/6 working)#115
fix: repair all 6 knowledge import endpoints (0/6 → 6/6 working)#115
Conversation
- Separate knowledge_ingestion_service import from knowledge_management
and knowledge_pipeline so thinc/spaCy failures don't disable ingestion
- Add GET /api/knowledge/import/status/{job_id} endpoint
- Add DELETE /api/knowledge/import/cancel/{job_id} endpoint
- Fix batch endpoint to use actual ingestion service
- Fix cancel endpoint to use actual ingestion service
- Fix HTTPException swallowing in URL/text/file/wikipedia endpoints
- Remove dead code (duplicate except block) in URL import
- Add 18 pytest tests for all 6 endpoints via httpx.AsyncClient
Co-authored-by: Steake <530040+Steake@users.noreply.github.com>
…back ID Co-authored-by: Steake <530040+Steake@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Repairs the knowledge import API surface on backend/unified_server.py so the ingestion endpoints function even when optional knowledge-related dependencies fail to import, and adds/updates endpoint implementations + tests to cover the full 6-endpoint contract.
Changes:
- Decouples knowledge service imports so ingestion can remain available independently of management/pipeline imports.
- Adds missing status/cancel alias endpoints and replaces stubbed batch/cancel logic with ingestion-service backed behavior.
- Adds an async HTTP-layer test suite covering all 6 knowledge import endpoints and updates an existing cancel assertion to allow
not_found.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| backend/unified_server.py | Splits knowledge service imports; adds status/cancel aliases; fixes HTTPException handling; implements batch/cancel behavior via ingestion service. |
| tests/backend/test_knowledge_import_endpoints.py | New async tests for all 6 knowledge import endpoints using httpx.ASGITransport and a mocked ingestion service. |
| tests/backend/test_api_endpoints.py | Relaxes cancel assertion to accept either cancelled or not_found. |
backend/unified_server.py
Outdated
| logger.error(f"Error importing knowledge from file: {e}") | ||
| raise HTTPException(status_code=500, detail=f"File import error: {str(e)}") |
There was a problem hiding this comment.
The except Exception as e: block has over-indented statements (logger.error / raise) which will cause an IndentationError and prevent backend/unified_server.py from importing. Align the contents of the generic except block with the except indentation level.
| logger.error(f"Error importing knowledge from file: {e}") | |
| raise HTTPException(status_code=500, detail=f"File import error: {str(e)}") | |
| logger.error(f"Error importing knowledge from file: {e}") | |
| raise HTTPException(status_code=500, detail=f"File import error: {str(e)}") |
backend/unified_server.py
Outdated
| @@ -2664,6 +2683,8 @@ async def import_knowledge_from_file(file: UploadFile = File(...), filename: str | |||
| "file_type": file_type | |||
There was a problem hiding this comment.
import_knowledge_from_file computes determined_file_type and passes it into FileImportRequest, but the HTTP response returns the raw file_type form field (which may be None/different). Return the actual determined_file_type so clients see the true interpreted type.
| "file_type": file_type | |
| "file_type": determined_file_type |
backend/unified_server.py
Outdated
| else: | ||
| import_ids.append(fallback_id) | ||
| except Exception as exc: | ||
| logger.warning(f"Batch item {i} failed: {exc}") | ||
| import_ids.append(fallback_id) | ||
|
|
||
| return {"import_ids": import_ids, "batch_size": len(import_ids), "status": "queued"} |
There was a problem hiding this comment.
import_knowledge_batch appends a generated fallback_id when ingestion is unavailable or when a batch item fails, but still returns status: "queued" without indicating which items failed to enqueue. This can mislead clients into thinking the entire batch was accepted. Consider returning per-item results (id + status/error) and/or using a non-2xx response for invalid items when the ingestion service is available.
| level so that we test the HTTP layer in isolation. | ||
| """ | ||
|
|
||
| import asyncio |
There was a problem hiding this comment.
Unused import asyncio (not referenced anywhere in this test module). Removing it avoids lint noise and keeps the test focused.
| import asyncio |
- Use BatchImportSchema from shared schemas for batch endpoint - Fix file_type response to return determined_file_type - Fix over-indented except block in file import handler - Add per-item results to batch response - Remove unused asyncio import from test module - Update validation tests: expect 422 for Pydantic validation errors Co-authored-by: Steake <530040+Steake@users.noreply.github.com>
Merged main into the branch at |
…owledge-endpoints # Conflicts: # backend/unified_server.py # godelOS/core_kr/knowledge_store/__init__.py
🧪 CI — Python 3.11 |
🧪 CI — Python 3.10 |
Description
All 6 knowledge import endpoints returned 503 because a single
try/exceptblock coupledknowledge_ingestion_service,knowledge_management_service, andknowledge_pipeline_serviceimports together. Whenthinc(spaCy dep) was unavailable, the entire block failed, settingKNOWLEDGE_SERVICES_AVAILABLE = False.Import chain fix
try/exceptblocks so ingestion works even when management/pipeline deps are missingMissing endpoints
GET /api/knowledge/import/status/{job_id}(delegates to existing progress handler)DELETE /api/knowledge/import/cancel/{job_id}(delegates to cancel logic)Stub replacements
POST /api/knowledge/import/batch— was generating fake IDs; now usesBatchImportSchemafrom the shared schema layer and routes throughKnowledgeIngestionServicefor text/URL sources. Returns per-itemresultsarray with individual status/error for transparency.DELETE /api/knowledge/import/{import_id}— was hardcoded"cancelled"; now callscancel_import()and reports actual stateBug fixes
except Exceptionwas swallowingHTTPException(400)and re-raising as 500 in all four import endpoints (url/text/file/wikipedia). Addedexcept HTTPException: raisebefore the generic handler.exceptblock in file import handler that would causeIndentationErrordetermined_file_typeinstead of the raw form fieldSchema integration (merged from main)
backend/schemas.py(URLImportSchema,TextImportSchema,WikipediaImportSchema,BatchImportSchema) as introduced by PR fix: resolve backend-frontend API contract mismatches (shared schema definitions) #114. Missing required fields now return 422 (Pydantic validation) instead of manual 400 checks.Related Issues
Test Evidence
18 new async tests via
httpx.AsyncClientagainst the ASGI app (no live server). All 146 backend + integration tests pass:CodeQL: 0 alerts (Python + JavaScript).
Checklist
pytest tests/)black .andisort .)Original prompt
🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.