feat: Implement SelfModificationEngine with REST API (issue #95)#128
Conversation
Co-authored-by: Steake <530040+Steake@users.noreply.github.com>
Pull request overview
Implements an in-memory SelfModificationEngine and exposes it via new REST endpoints in backend/unified_server.py so the existing self-modification UI (issue #95 / PR #41) can propose/apply/rollback changes and verify them via API reads.
Changes:
- Added `SelfModificationEngine` with propose/apply/rollback and ordered history tracking.
- Wired the engine into the `unified_server` lifespan and added self-modification + modules status endpoints; `/api/knowledge` now merges engine knowledge items.
- Added backend tests covering engine behavior plus HTTP endpoint flows.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `backend/core/self_modification_engine.py` | New in-memory engine implementing propose/apply/rollback across multiple targets with snapshot-based rollback and history. |
| `backend/unified_server.py` | Initializes the engine on startup; adds self-modification endpoints; merges engine knowledge items into `/api/knowledge`; adds `/api/system/modules`. |
| `backend/core/__init__.py` | Wraps eager imports to avoid package-level import failures when optional deps are missing. |
| `tests/backend/test_self_modification.py` | Adds unit + HTTP tests for proposing, applying, rolling back, and reading modification history. |
```python
engine_items = self_modification_engine.get_knowledge_items()
if engine_items:
    existing_concepts = list(base.get("concepts", []))
    existing_concepts.extend(engine_items)
    result = {**base, "concepts": existing_concepts, "total_count": int(base.get("total_count", 0)) + len(engine_items)}
    return result
```
The merge logic appends engine concepts regardless of knowledge_type and limit, so /api/knowledge?knowledge_type=fact&limit=50 can still return concept entries and exceed the requested limit. Filter/limit the engine-provided items consistently with the endpoint’s query params (or gate the merge to only concept queries).
Suggested change:
```python
# Only merge engine-provided items when concepts are requested (or no specific type is given)
if knowledge_type is None or str(knowledge_type).lower() in {"concept", "concepts"}:
    engine_items = self_modification_engine.get_knowledge_items()
    if engine_items:
        existing_concepts = list(base.get("concepts", []))
        added_items = 0
        # Respect the limit parameter for the total number of concept items
        if isinstance(limit, int) and limit > 0:
            remaining_slots = max(limit - len(existing_concepts), 0)
            if remaining_slots > 0:
                to_add = list(engine_items)[:remaining_slots]
                merged_concepts = existing_concepts + to_add
                added_items = len(to_add)
            else:
                merged_concepts = existing_concepts
        else:
            merged_concepts = existing_concepts + list(engine_items)
            added_items = len(engine_items)
        result = {
            **base,
            "concepts": merged_concepts,
            "total_count": int(base.get("total_count", 0)) + added_items,
        }
        return result
```
```python
try:
    result = await self_modification_engine.apply_modification(proposal_id)
except ValueError as exc:
    raise HTTPException(status_code=404, detail=str(exc))
```
This endpoint maps every ValueError from apply_modification() to HTTP 404. The engine also raises ValueError in cases that are not "not found" (e.g., the proposal exists but its status is not pending, or parameters are invalid), so callers will get a misleading 404. Consider distinguishing not-found from invalid-state/validation errors (e.g., 404 vs 409/400) by using dedicated exception types or inspecting the error cause.
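A minimal sketch of the dedicated-exception approach (class and function names are hypothetical; the engine currently raises plain ValueError), showing how a route could map each exception type to a distinct status code:

```python
# Hypothetical exception hierarchy for precise HTTP error mapping.
class ProposalNotFoundError(KeyError):
    """Proposal ID does not exist -> route maps to HTTP 404."""

class ProposalStateError(RuntimeError):
    """Proposal exists but is not in an applicable state -> route maps to HTTP 409."""

def apply_modification(proposals: dict, proposal_id: str) -> dict:
    # Illustrative apply: only pending proposals may transition to applied.
    proposal = proposals.get(proposal_id)
    if proposal is None:
        raise ProposalNotFoundError(proposal_id)
    if proposal["status"] != "pending":
        raise ProposalStateError(f"proposal {proposal_id} is {proposal['status']}")
    proposal["status"] = "applied"
    return proposal
```

The route handler can then catch `ProposalNotFoundError` for 404 and `ProposalStateError` for 409, leaving any remaining ValueError to surface as a 400.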
```python
target = payload.get("target", "")
operation = payload.get("operation", "")
parameters = payload.get("parameters", {})
try:
    proposal = await self_modification_engine.propose_modification(
        target=target, operation=operation, parameters=parameters
    )
```
parameters is passed through without shape validation. If a client sends "parameters": null (or another non-object), the engine will later crash on .get(...) during apply, returning a 500. Validate that parameters is a dict/object at propose-time (or coerce None to {}) and return a 400 on invalid payloads.
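A possible propose-time guard, as a hedged sketch (the helper name `normalize_parameters` is hypothetical): coerce `null` to an empty object and reject any other non-object payload so the route can return 400 immediately.

```python
from typing import Any, Dict, Optional

def normalize_parameters(raw: Optional[Any]) -> Dict[str, Any]:
    """Validate the request's `parameters` field before it reaches the engine."""
    if raw is None:
        return {}  # coerce JSON null to an empty object
    if not isinstance(raw, dict):
        raise ValueError(f"'parameters' must be an object, got {type(raw).__name__}")
    return raw
```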
```python
try:
    from .cognitive_manager import CognitiveManager, get_cognitive_manager
    from .agentic_daemon_system import AgenticDaemonSystem, get_agentic_daemon_system

    __all__ = [
        "CognitiveManager",
        "get_cognitive_manager",
        "AgenticDaemonSystem",
        "get_agentic_daemon_system",
    ]
except ImportError:
    __all__ = []
```
backend.core.__init__ now swallows any ImportError from these imports and silently exposes an empty __all__, which can mask real bugs (not just optional deps) and make failures harder to diagnose. Consider logging the exception (or re-raising unless the missing module is one of the known-optional deps), so broken imports don’t fail silently.
```python
def __init__(self) -> None:
    # Per-target in-memory stores
    self._knowledge_graph: Dict[str, Dict[str, Any]] = {}
    self._inference_rules: Dict[str, Dict[str, Any]] = {}
    self._attention_weights: Dict[str, float] = {}
    self._active_modules: Dict[str, Dict[str, Any]] = copy.deepcopy(self._DEFAULT_MODULES)

    # Proposal / history stores
    self._proposals: Dict[str, ModificationProposal] = {}
    self._records: Dict[str, ModificationRecord] = {}
    self._history_order: List[str] = []  # ordered proposal IDs
```
The engine mutates shared in-memory dicts (_proposals, _records, target stores) from async methods without any locking. Under concurrent requests, propose/apply/rollback can interleave and corrupt snapshots/history ordering. Consider guarding state mutations with an asyncio.Lock (as done in other core components) to make operations atomic.
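A minimal illustration of the locking pattern (the `LockedEngine` class is hypothetical; the real engine holds more state), showing how a single `asyncio.Lock` keeps the proposal store and history ordering consistent:

```python
import asyncio
from typing import Any, Dict, List

class LockedEngine:
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._proposals: Dict[str, Dict[str, Any]] = {}
        self._history_order: List[str] = []

    async def propose(self, proposal_id: str, payload: Dict[str, Any]) -> None:
        # Atomic section: the proposal store and ordered history update together,
        # so concurrent requests cannot interleave between the two writes.
        async with self._lock:
            self._proposals[proposal_id] = payload
            self._history_order.append(proposal_id)
```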
```python
# Initialize SelfModificationEngine
try:
    from backend.core.self_modification_engine import SelfModificationEngine
    self_modification_engine = SelfModificationEngine()
    logger.info("✅ SelfModificationEngine initialized")
except Exception as e:
    logger.error(f"Failed to initialize SelfModificationEngine: {e}")
```
self_modification_engine is assigned only inside the lifespan startup. If initialization fails, the name may never be bound, and later route handlers that reference it will raise NameError/AttributeError rather than returning 503. Define self_modification_engine = None at module scope and set it explicitly to None in the exception path here (optionally log with logger.exception).
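A sketch of the module-scope binding (`init_engine` is a hypothetical helper standing in for the lifespan hook): the global exists at import time, and the failure path explicitly leaves it `None` so handlers can return 503 instead of raising NameError.

```python
import logging

logger = logging.getLogger(__name__)

self_modification_engine = None  # bound at import time, before the lifespan hook runs

def init_engine(factory):
    """Hypothetical lifespan-startup helper: binds the global, or leaves it None."""
    global self_modification_engine
    try:
        self_modification_engine = factory()
    except Exception:
        logger.exception("Failed to initialize SelfModificationEngine")
        self_modification_engine = None  # explicit, so route handlers return 503
    return self_modification_engine
```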
```python
# Proposal / history stores
self._proposals: Dict[str, ModificationProposal] = {}
self._records: Dict[str, ModificationRecord] = {}
self._history_order: List[str] = []  # ordered proposal IDs
```
History/proposal storage is unbounded and stores deep-copied before/after snapshots per apply, so memory usage will grow without limit on a long-running server. Consider adding a retention policy (max records), optional snapshot disabling, or a way to prune proposals/records after some TTL.
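A sketch of a simple FIFO retention policy. The 1000-record cap matches the limit later adopted in a52067a, but the helper shown here is illustrative rather than the PR's actual code:

```python
from collections import OrderedDict
from typing import Any, Dict

MAX_RECORDS = 1000  # retention cap; the exact value is a tunable policy choice

def record_history(records: OrderedDict, proposal_id: str, record: Dict[str, Any]) -> None:
    """Append a record, evicting the oldest entries beyond the cap."""
    records[proposal_id] = record
    while len(records) > MAX_RECORDS:
        records.popitem(last=False)  # FIFO eviction of the oldest record
```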
```python
if operation in ("add", "modify"):
    weight = float(parameters.get("weight", 0.5))
```
component can be empty and weight coercion via float(...) can raise TypeError/ValueError (e.g., null or non-numeric strings). Add validation that component is non-empty and weight is a valid number, and raise a clear validation error so the API can return 400 rather than failing unexpectedly.
Suggested change:
```python
if not component:
    raise ValueError(
        "Parameter 'component' is required for attention_weights operations"
    )
if operation in ("add", "modify"):
    raw_weight = parameters.get("weight", 0.5)
    try:
        weight = float(raw_weight)
    except (TypeError, ValueError):
        raise ValueError(
            f"Invalid 'weight' value for attention_weights: {raw_weight!r}"
        )
```
```python
original = srv.self_modification_engine
srv.self_modification_engine = engine
yield app, engine
srv.self_modification_engine = original
```
This fixture assumes backend.unified_server.self_modification_engine exists at import time (original = srv.self_modification_engine). In the current server code it’s only created during lifespan startup, so this access can raise AttributeError and make the test suite order-dependent. Use getattr(srv, "self_modification_engine", None) and/or ensure the server defines the global as None at module import time.
Suggested change:
```python
original_exists = hasattr(srv, "self_modification_engine")
original = getattr(srv, "self_modification_engine", None)
srv.self_modification_engine = engine
yield app, engine
if original_exists:
    srv.self_modification_engine = original
else:
    # Remove the attribute if we created it in this fixture
    if hasattr(srv, "self_modification_engine"):
        delattr(srv, "self_modification_engine")
```
backend/unified_server.py
```python
@app.post("/api/self-modification/propose")
async def propose_modification(payload: Dict[str, Any]):
    """Submit a modification proposal.

    Returns ``{ proposal_id, target, operation, parameters, status: "pending" }``
    """
    if self_modification_engine is None:
        raise HTTPException(status_code=503, detail="SelfModificationEngine not initialized")
```
These endpoints allow runtime self-modification (including enabling/disabling modules) with no authentication/authorization checks. If the server can be reached by untrusted clients, this is a high-impact security risk. Consider gating the self-modification routes behind an environment-flag (dev-only) or requiring an admin token/header before allowing propose/apply/rollback.
No, this is the point of GodelOS, but we should have authenticated sessions for these endpoints.
Added session-token auth in a52067a. All self-modification endpoints now require a valid X-API-Token header when GODELOS_API_TOKEN is set. Auth is skipped when the env var is empty/unset (dev mode).
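A pure-Python sketch of the check that a52067a describes (the function name is hypothetical; in the server it would run as a request dependency). When `GODELOS_API_TOKEN` is unset or empty, auth is skipped; otherwise the `X-API-Token` value is compared in constant time.

```python
import hmac
import os
from typing import Optional

def is_authorized(header_token: Optional[str]) -> bool:
    """Return True if the request may use self-modification endpoints."""
    expected = os.environ.get("GODELOS_API_TOKEN", "")
    if not expected:
        return True  # dev mode: auth skipped when the env var is empty/unset
    if header_token is None:
        return False
    # Constant-time comparison avoids leaking token prefixes via timing.
    return hmac.compare_digest(header_token, expected)
```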
@copilot apply changes based on the comments in this thread
- Add asyncio.Lock for concurrent safety on all state mutations
- Add history retention cap (max 1000 records) with eviction
- Add ProposalNotFoundError/ProposalStateError for 404 vs 409 mapping
- Validate parameters is dict, non-empty component/weight, non-empty module name
- Add session-token auth (X-API-Token header, GODELOS_API_TOKEN env var)
- Respect knowledge_type and limit in /api/knowledge merge
- Log ImportError in backend/core/__init__.py
- Define self_modification_engine=None at module scope
- Fix test fixture with getattr; disable lifespan in transport
- Add 11 new tests (28 total): auth, 409 conflict, validation edge cases

Co-authored-by: Steake <530040+Steake@users.noreply.github.com>
All 12 review comments addressed in a52067a:
Description
The self-modification UI (PR #41) had no backend engine. This PR implements `SelfModificationEngine`, wires it into the server lifespan, and exposes a REST API the UI can consume.

`backend/core/self_modification_engine.py` (new)
- Targets: `knowledge_graph`, `inference_rules`, `attention_weights`, `active_modules`
- Operations: `add`, `remove`, `modify`, `enable`, `disable`
- `asyncio.Lock` guards all state mutations for concurrent safety
- `ProposalNotFoundError` (→ HTTP 404) and `ProposalStateError` (→ HTTP 409) for precise error semantics
- Validation: `parameters` must be a dict; `attention_weights` requires a non-empty `component` and a valid numeric `weight`; `active_modules` enable/disable/remove require a non-empty module identifier

`backend/unified_server.py`
- `self_modification_engine = None` defined at module scope; initialized in lifespan startup with explicit `None` on failure
- `X-API-Token` header required on all self-modification endpoints; reads the `GODELOS_API_TOKEN` env var; auth skipped when unset (dev mode)
- `POST /api/self-modification/propose` → `{proposal_id, status: "pending", ...}`
- `POST /api/self-modification/apply/{id}` → `{status: "applied", changes_summary}` (409 on double-apply)
- `POST /api/self-modification/rollback/{id}` → `{status: "rolled_back"}` (409 on double-rollback)
- `GET /api/self-modification/history` → ordered record list
- `GET /api/system/modules` returns the active modules registry from the engine
- `GET /api/knowledge` merges the engine's knowledge items into the response, respecting the `knowledge_type` filter and `limit` parameter
- `parameters` validated at propose-time: `null` coerced to `{}`; non-object returns 400

`backend/core/__init__.py`
- `try/except` with `logger.warning()`: prevents a missing `numpy` (or similar optional dep) from propagating as an `ImportError` while still logging the cause for diagnostics

`tests/backend/test_self_modification.py` (new)
- 28 tests across unit and HTTP layers covering all 4 required scenarios plus edge cases
Related Issues
Test Evidence
Checklist
- Tests pass (`pytest tests/`)
- Formatted with `black .` and `isort .`

Original prompt
This pull request was created from Copilot chat.