### Is your feature request related to a problem?

The `LearningEngine` stores Q-values, algorithm configs, and RL state only in RAM. On MCP server restart, all trained Q-values are lost, `hooks_learning_config` settings vanish (causing an `epsilon` error), and users must re-bootstrap every session.
### Root Cause

- `export()` incomplete — missing `eligibilityTraces` and `actorWeights`
- `import()` incomplete — doesn't restore the above Maps
- No dedicated persistence file — learning state mixed into `intelligence.json`
- No auto-recovery on MCP startup
- `rewardHistory` grows unbounded
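For context on why the missing Maps matter: `JSON.stringify` serializes a `Map` to `{}`, so any Map that a hand-written `export()` forgets is silently dropped on save. A minimal sketch (field names like `qTable` are illustrative, not the actual `LearningEngine` internals):

```typescript
// A Map does not survive JSON round-tripping on its own:
const qTable = new Map([["state-a:action-1", 0.42]]);
JSON.stringify(qTable); // → "{}" — all entries silently lost

// export() must convert each Map to a plain array of entries...
const exported = JSON.stringify({ qTable: [...qTable.entries()] });

// ...and import() must rebuild the Map, with a graceful default
// when the field is absent in older snapshots:
const parsed = JSON.parse(exported);
const restored = new Map<string, number>(parsed.qTable ?? []);
```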
### Describe the solution you'd like

**Phase 1: Fix export/import + dedicated `learning-state.json`**

- Complete `export()` to serialize all internal Maps
- Complete `import()` to restore all Maps with graceful defaults
- Persist to `.ruvector/learning-state.json` (atomic write: tmp → rename)
- Auto-load on MCP startup
- Cap `rewardHistory` at 500 entries
**Phase 2 (future): RVF `POLICY_KERNEL` persistence**

Store Q-tables in the `POLICY_KERNEL` (0x31) segment, extending its semantics beyond Thompson Sampling to cover tabular RL (Q-Learning, SARSA, Double Q-Learning, Actor-Critic). This aligns with ADR-029 (RVF Canonical Format) and ADR-036 (AGI Cognitive Container).
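As a rough illustration of what a Q-table packed into a tagged binary segment could look like — the real layout is whatever ADR-029 defines; only the 0x31 tag comes from the proposal above, everything else here is hypothetical:

```typescript
const POLICY_KERNEL_TAG = 0x31; // segment type from the proposal above

// Hypothetical encoding: [u8 tag][u32 entry count] followed by, per entry,
// [u16 key length][UTF-8 key][f64 Q-value]. Not the real RVF layout.
function encodeQTable(qTable: Map<string, number>): Buffer {
  const chunks: Buffer[] = [];
  const header = Buffer.alloc(5);
  header.writeUInt8(POLICY_KERNEL_TAG, 0);
  header.writeUInt32LE(qTable.size, 1);
  chunks.push(header);
  for (const [key, value] of qTable) {
    const keyBytes = Buffer.from(key, "utf8");
    const entry = Buffer.alloc(2 + keyBytes.length + 8);
    entry.writeUInt16LE(keyBytes.length, 0);
    keyBytes.copy(entry, 2);
    entry.writeDoubleLE(value, 2 + keyBytes.length);
    chunks.push(entry);
  }
  return Buffer.concat(chunks);
}
```

A length-prefixed layout like this keeps the segment self-describing, so a reader can skip it without understanding the RL algorithm that produced it.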
### PR

Phase 1 implementation: dmoellenbeck#2 (pending)
### Environment

- ruvector 0.2.x (npm), Node.js 22.5+, macOS
- Real-world use case: endurance coaching with 200+ Q-learning experiences, daily training adaptation