
feat: modality pipeline, service mesh, foveated context, research orchestration#2

Merged
chazmaniandinkle merged 69 commits into cogos-dev:main from chazmaniandinkle:main
Apr 13, 2026

@chazmaniandinkle
Contributor

Summary

Accumulated development from the private workspace (68 commits):

  • Modality bus: types, pipeline, HTTP module, text module, wire protocol, tests
  • ModalityPipeline wired into cog serve daemon with auto-discovery from service CRDs
  • /v1/speak kernel endpoint for voice output routing
  • Service mesh with health monitoring and capability advertisement
  • Foveated context engine with iris-driven variable-resolution rendering
  • Research orchestration subsystem
  • Session lifecycle management (kernel-owned sessions, v2.4.0)
  • Bus event query API — filtered events, stats, cross-bus search
  • Workspace indexer — structural proprioception with tree-sitter AST parsing
  • Autoresearch pipeline — signal extraction, nightly consolidation, ablation harnesses
  • TRM evaluation — sigmoid normalization, path dedup, honest NDCG metrics (0.878 mean)
  • OpenAI-compatible provider for LM Studio, vLLM, llama.cpp
  • Codex provider + sandbox agent harness
  • E2e test suite with Ollama integration
  • CI/CD workflows, release automation, changelog
  • Professional docs — README rewrite, EVALUATION.md, CONTRIBUTING.md

Also reconciles the fork divergence (unrelated histories merge) and fixes a prompt test assertion to match actual buildPrompt output format.

Test plan

  • go build ./... passes
  • go vet ./... passes
  • go test ./... passes (except known TestExistingRoles — needs .cog/roles dir)

🤖 Generated with Claude Code

chazmaniandinkle and others added 30 commits January 22, 2026 16:05
Extracted from cog-workspace/.cog/ into standalone repository.

- Core kernel: memory, inference, ledger, serve
- Go SDK for external integration
- Multi-platform build system
- Shell scripts for extended commands

The kernel provides:
- Hierarchical Memory Domains (HMD)
- Multi-provider inference routing (Claude, OpenRouter, local)
- OpenAI-compatible API server
- Workspace coherence tracking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major refactoring of inference.go and serve.go with
improved architecture and code organization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ensity

Add composable "connection threads" to the CogField graph endpoint.
Each thread is a named edge source with distinct weight, computed
server-side and toggled client-side:

- explicit: existing doc_references (313 edges)
- shared_tags: docs sharing 2+ non-date tags (up to 5K edges)
- siblings: same parent directory, groups of 2-19
- temporal: created same day, groups of 2-30

Also fixes the top-K most-connected algorithm (was partial selection
sort, now uses sort.Slice).
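
A minimal sketch of top-K selection with sort.Slice, as the fix above describes. The `node` type and `topK` name are hypothetical stand-ins; the real CogField types are not shown in this commit.

```go
package main

import "sort"

// node is a hypothetical stand-in for a CogField graph node.
type node struct {
	ID    string
	Edges int
}

// topK returns the k most-connected nodes, most-connected first.
// It sorts a copy so the caller's slice order is left untouched.
func topK(nodes []node, k int) []node {
	sorted := make([]node, len(nodes))
	copy(sorted, nodes)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Edges != sorted[j].Edges {
			return sorted[i].Edges > sorted[j].Edges
		}
		return sorted[i].ID < sorted[j].ID // deterministic tie-break
	})
	if k < 0 {
		k = 0
	}
	if k > len(sorted) {
		k = len(sorted)
	}
	return sorted[:k]
}
```

Sorting the whole slice costs O(n log n), which is usually acceptable for graphs of a few thousand nodes and is much harder to get wrong than a hand-rolled partial selection sort.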

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cement

24-fix audit spanning security, concurrency, and reliability:

Security:
- Bind HTTP server to 127.0.0.1 (was 0.0.0.0)
- Add http.MaxBytesReader on all POST endpoints (64KB/1MB limits)
- Restrict CORS origins to localhost only
- Add path traversal prevention in memory read/write
- Fix grep flag injection via "--" end-of-options separator
- Restrict WebSocket origins to localhost patterns
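
Two of the security fixes above can be sketched as follows. This is an illustration under assumptions, not the kernel's code: `resolveInside` and `grepArgs` are hypothetical names, and the `-rn` grep flags are only an example.

```go
package main

import (
	"errors"
	"path/filepath"
	"strings"
)

// resolveInside joins a caller-supplied relative path onto root and rejects
// any result that would land outside root (for example via "../").
// It does not resolve symlinks; that would need an additional check.
func resolveInside(root, rel string) (string, error) {
	full := filepath.Join(root, rel) // Join also cleans "." and ".." elements
	relToRoot, err := filepath.Rel(root, full)
	if err != nil {
		return "", err
	}
	if relToRoot == ".." || strings.HasPrefix(relToRoot, ".."+string(filepath.Separator)) {
		return "", errors.New("path escapes root")
	}
	return full, nil
}

// grepArgs builds an argument list in which "--" ends option parsing, so a
// pattern such as "-f/etc/passwd" is treated as a pattern, not as a flag.
func grepArgs(pattern, dir string) []string {
	return []string{"-rn", "--", pattern, dir}
}
```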

Concurrency:
- Add safeSend() closure preventing goroutine deadlock on CLI streaming
- Add sync.Mutex for signal field read-modify-write operations
- Convert DebugMode from bool to atomic.Bool for cross-goroutine access
- Return defensive copies from Registry.Get() to prevent data races
- Add sync.Once constellation singleton (eliminates 2-5ms/request overhead)
- Add StartRegistryCleanup() with 5-minute GC ticker
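
One plausible shape for the safeSend() closure mentioned above is a channel send that gives up when the request context is cancelled. The names and signature here are assumptions; the commit does not show the actual implementation.

```go
package main

import "context"

// newSafeSend returns a closure that delivers chunks to out but gives up as
// soon as ctx is cancelled, so a producer goroutine cannot block forever on
// a consumer that has gone away.
func newSafeSend(ctx context.Context, out chan<- string) func(string) bool {
	return func(chunk string) bool {
		select {
		case out <- chunk:
			return true
		case <-ctx.Done():
			return false
		}
	}
}

// demoSafeSend is a usage example: the first send fits in the buffer, and
// after cancellation a send to the full channel returns false instead of
// blocking.
func demoSafeSend() (first, afterCancel bool) {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan string, 1)
	send := newSafeSend(ctx, out)
	first = send("chunk-1") // buffer has room: delivered
	cancel()
	afterCancel = send("chunk-2") // buffer full and ctx cancelled: gives up
	return first, afterCancel
}
```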

Timeouts:
- Wrap all bare exec.Command calls with exec.CommandContext
- Git operations: 30s timeout via gitCmd()/gitCommandCtx() helpers
- Hook/script execution: 60s timeout via hookCmd() helper
- TUI inference: 5min timeout
- Task orchestration: 10min timeout
- Annotate intentional bare calls with // bare-ok
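
The timeout wrappers above might look roughly like this. `runWithTimeout` and `gitStatus` are hypothetical helpers; the real gitCmd()/gitCommandCtx()/hookCmd() signatures are not given in the commit message.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runWithTimeout runs a command under a deadline. exec.CommandContext kills
// the process when the context expires, so a hung child cannot stall the
// kernel indefinitely.
func runWithTimeout(timeout time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	if err != nil && ctx.Err() == context.DeadlineExceeded {
		return out, fmt.Errorf("%s timed out after %s: %w", name, timeout, ctx.Err())
	}
	return out, err
}

// gitStatus applies the 30s git budget listed above to one example command.
func gitStatus(dir string) ([]byte, error) {
	return runWithTimeout(30*time.Second, "git", "-C", dir, "status", "--porcelain")
}
```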

Build:
- Version bump to 2.1.0
- Atomic install: build → verify → backup → .tmp → mv
- Add `make lint` target enforcing no bare exec.Command
- Add `make test` with unit tests + smoke tests
- Fix Makefile source dependencies (wildcard *.go)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aph refinements

Expand CogField backend with new data endpoints and decomposed source files:

- Add cogfield_sessions.go: session listing with constellation search
- Add cogfield_documents.go: document retrieval and detail views
- Add cogfield_buses.go: CogBus event streaming endpoint
- Add cogfield_adapters.go: adapter registry and status endpoints
- Refine graph thread building: filter session↔session tag cliques,
  exclude generic tags (session, claude-code), improve edge weighting
- Use constellation singleton for all DB access (from kernel hardening)
- Add EXTRACTION_GAPS.md tracking divergences between extracted and
  workspace kernel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename the TAA context assembly flag from --taa/-t to --profile/-p based
on cross-tool CLI terminology analysis (AWS, OpenClaw, k8s, Docker, etc).
Add cog info command showing kernel, workspace, server, bus, and CLI
tool versions — modeled after helm/kubectl version output.

Also includes accumulated kernel work: bus session management,
reconciliation engine, Discord IaC provider, agent provider,
and test coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SSE streaming endpoint emitted token usage in a non-standard format
(custom event_type field, wrong chunk ordering) that OpenAI-compatible
SDKs couldn't parse. Consumers like OpenClaw's pi-ai showed zero tokens.

Changes:
- Add Usage field to StreamChunk struct
- Emit finish_reason chunk first (with usage attached)
- Emit dedicated usage-only chunk after (choices:[], standard format)
- Remove non-standard event_type:"usage" wrapper
- Bump kernel version to 2.1.1
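
The chunk ordering described above can be sketched as follows. The struct names and fields are illustrative assumptions modeled on the OpenAI streaming wire format, not the kernel's actual StreamChunk definition.

```go
package main

import "encoding/json"

type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

type delta struct {
	Content string `json:"content,omitempty"`
}

type choice struct {
	Index        int     `json:"index"`
	Delta        delta   `json:"delta"`
	FinishReason *string `json:"finish_reason"`
}

type streamChunk struct {
	ID      string   `json:"id"`
	Object  string   `json:"object"`
	Model   string   `json:"model"`
	Choices []choice `json:"choices"`
	Usage   *usage   `json:"usage,omitempty"`
}

// sseLine frames one chunk as a server-sent event.
func sseLine(c streamChunk) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return "data: " + string(b) + "\n\n"
}

// finalChunks emits the finish_reason chunk first (usage attached), then a
// usage-only chunk whose choices array is empty.
func finalChunks(id, model string, u usage) []string {
	stop := "stop"
	base := streamChunk{ID: id, Object: "chat.completion.chunk", Model: model}

	finish := base
	finish.Choices = []choice{{Index: 0, FinishReason: &stop}}
	finish.Usage = &u

	usageOnly := base
	usageOnly.Choices = []choice{} // must encode as [], not null
	usageOnly.Usage = &u

	return []string{sseLine(finish), sseLine(usageOnly)}
}
```

Initializing `Choices` to an empty slice matters: a nil slice would marshal as `null`, which some SDK parsers reject.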

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… decisions

Add --allowed-tools forwarding from HTTP API to Claude CLI, bridging
the tool gap between OpenClaw and the kernel. OpenAI-format tool
definitions are auto-mapped to CLI names (exec→Bash, read→Read, etc.)
with explicit override via X-Allowed-Tools header.

Add OpenTelemetry tracing foundation with nested spans across the
full request lifecycle: HTTP handler → inference dispatch → CLI exec,
including tool_use/tool_result events and token counts. Zero overhead
when OTEL_EXPORTER_OTLP_ENDPOINT is unset (noop tracer).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation (ADR-060 Phase 1)

Kernel-level component topology awareness: registry manifest, content-addressable
indexer with Merkle root, reconciliation provider, and cognitive field adapter.

- cog components [list|status|index|register] CLI commands
- Component blobs at .cog/.state/components/ with SHA-256 blob hashes
- ComponentProvider (Reconcilable) for drift detection (report-only Phase 1)
- ComponentAdapter (BlockAdapter) emitting infrastructure sector hexagons
- Fix nil deref in cmdGenericStatus when no reconcile state exists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ands

MCP Bridge (Phase 2 — ADR dual-loop resolution):
- Extract MCP server from .cog/mcp.go into apps/cogos/mcp.go with bridge mode
- Add OpenClawBridge HTTP client (FetchToolManifest, ExecuteTool)
- Add case "mcp" dispatch in cog.go routing to cmdMCP
- Add InferenceRequest MCP fields (MCPConfig, OpenClawURL, OpenClawToken, SessionID)
- Add generateMCPConfig() temp file generation for --mcp-config flag
- Parse X-OpenClaw-URL/Token headers in serve.go for bridge mode activation
- Bridge merges cogos_* (local) + openclaw_* (proxied) tool namespaces
- Excludes browser/canvas tools (Phase 3)

URI notation:
- Add PathToURI helper for cog:// URI output in serve status/start
- Convert filepath output to URIs in discord_reconcile, salience, validation
- Add URI fragment resolution (cog://mem/path#section)
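
A sketch of how a cog:// URI with a section fragment could be split using net/url. Treating the URI host as the namespace (for example `mem`) is an assumption, and `cogRef` and `parseCogURI` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// cogRef is a hypothetical parsed form of cog://<namespace>/<path>#<section>.
type cogRef struct {
	Namespace string
	Path      string
	Section   string
}

// parseCogURI splits a cog:// URI into namespace, path, and optional section.
func parseCogURI(raw string) (cogRef, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return cogRef{}, err
	}
	if u.Scheme != "cog" {
		return cogRef{}, fmt.Errorf("not a cog:// URI: %q", raw)
	}
	return cogRef{
		Namespace: u.Host,
		Path:      strings.TrimPrefix(u.Path, "/"),
		Section:   u.Fragment,
	}, nil
}
```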

Section-aware memory:
- Add sections.go with markdown section parser (GetSection, ListHeadings)
- Add complete.go with zsh completion generation
- Add cog memory toc, --section flag, --frontmatter flag
- Add cog memory index / bulk index commands
- Auto-generate section indices on memory write

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… workspace cwd fix

MCP Bridge - Push-based tool registry:
- Replace static openClawCoreTools() manifest with dynamic tool registry
- Tools flow from request body (OpenAI format) → convertOpenAIToolsToMCP → TOOL_REGISTRY env var
- Bridge reads registry on startup via LoadToolRegistry(), no HTTP discovery needed
- FetchToolManifest → ProbeGateway (connectivity check only, no tool enumeration)
- Delete 165 lines of hardcoded tool definitions

OTEL Tracing - Full bridge subprocess instrumentation:
- Bridge subprocess initializes its own tracer via OTEL_EXPORTER_OTLP_ENDPOINT env var
- Trace context propagation via TRACEPARENT env var (W3C Trace Context)
- generateMCPConfig passes both TRACEPARENT and OTEL endpoint to bridge
- New spans: mcp.bridge.activate, mcp.bridge.probe_gateway, mcp.tool.call, openclaw.tool.execute
- Tool call spans include routing decision (local vs remote), HTTP status, request/response sizes
- ExecuteTool injects trace context into outgoing HTTP headers for distributed tracing
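
For reference, the W3C Trace Context value carried in TRACEPARENT has the form `version-traceid-spanid-flags`. The sketch below formats one by hand and appends it to a child environment; the function names are hypothetical, and real code would more likely obtain the IDs from the OpenTelemetry SDK's span context.

```go
package main

import "fmt"

// formatTraceparent renders a W3C traceparent value: version "00", a 16-byte
// trace ID, an 8-byte parent span ID, and trace flags (01 means sampled).
func formatTraceparent(traceID [16]byte, spanID [8]byte, sampled bool) string {
	flags := 0
	if sampled {
		flags = 1
	}
	return fmt.Sprintf("00-%x-%x-%02x", traceID[:], spanID[:], flags)
}

// withTraceparent returns a copy of env with TRACEPARENT appended, suitable
// for assigning to a subprocess environment.
func withTraceparent(env []string, tp string) []string {
	out := append([]string{}, env...)
	return append(out, "TRACEPARENT="+tp)
}
```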

Inference - Workspace cwd fix:
- Set cmd.Dir to workspace root in both RunInference and RunInferenceStream
- Claude CLI now operates in the correct project context instead of inheriting kernel's cwd

CogField - Reactive conditions and reconciliation:
- Field-reactive condition evaluation after reconcile cycles
- Signal and reconcile adapters for cognitive field integration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude CLI now runs in the caller-specified workspace directory when
provided via UCP Workspace.Root, instead of always defaulting to the
kernel's workspace. This enables OpenClaw agents to operate within
their own workspace context while sharing the same CogOS kernel.

Priority chain: UCP workspace root → kernel workspace → inherited cwd.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a middle tier to the workspace resolution chain for callers
(like OpenClaw) that don't send UCP headers. When set, Claude CLI
processes default to this workspace instead of the kernel workspace.

Priority: UCP workspace → DEFAULT_CLIENT_WORKSPACE → kernel workspace.
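
The resolution chain amounts to "first non-empty wins". A minimal sketch, assuming DEFAULT_CLIENT_WORKSPACE is read from the environment by the caller and passed in; `resolveWorkspace` is a hypothetical name.

```go
package main

// resolveWorkspace returns the first non-empty candidate in priority order:
// UCP workspace root, then the DEFAULT_CLIENT_WORKSPACE value, then the
// kernel workspace. An empty result means the caller leaves cmd.Dir unset
// and the subprocess inherits the kernel's cwd.
func resolveWorkspace(ucpRoot, defaultClient, kernelRoot string) string {
	for _, candidate := range []string{ucpRoot, defaultClient, kernelRoot} {
		if candidate != "" {
			return candidate
		}
	}
	return ""
}
```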

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gnore binary

- Makefile: install to ~/.cog/bin/cogos (node-level) instead of workspace-relative .cog/cog
- sdk/cogos.go: rename "memory" projector to "mem" to match URI scheme
- .gitignore: add cogos binary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…penAI-compatible streaming

Extract ~3.4k LOC of inference logic into a separate `harness/` Go module,
connected to the kernel via a clean KernelServices interface. This separation
aligns with the architecture where the kernel resolves and the harness interfaces.

## Harness extraction (harness/)

New package with its own go.mod — zero imports of package main:

- harness.go     — Harness struct, RunInference, RunInferenceStream dispatch
- claude.go      — BuildClaudeArgs, Claude CLI execution (sync + stream)
- http.go        — OpenAI-compatible HTTP providers (sync + stream)
- providers.go   — ProviderType routing, ParseModelProvider, DefaultProviders
- types.go       — InferenceRequest/Response, ContextState, ChatMessage
- stream.go      — StreamChunkInference, Claude/OpenAI wire types
- registry.go    — RequestRegistry with TTL cleanup
- retry.go       — Error classification and retry with backoff
- tools.go       — OpenAI→Claude CLI tool name mapping
- config.go      — Three-tier config resolution (node/workspace/env)
- interfaces.go  — KernelServices contract (the bridge)
- otel.go        — OpenTelemetry tracer initialization

kernel_harness.go bridges kernel→harness with type converters and
kernelServicesAdapter implementing KernelServices.

## Multi-workspace serving

- constellation_singleton.go: per-workspace DB pool (sync.Map)
- serve.go: workspaceContext, workspaceMiddleware, per-request kernel
- cogfield*.go: per-request workspace in graph/query/adapter handlers
- bus_stream.go: per-workspace busChat resolution
- bus_api.go: new REST API for bus event queries

## OpenAI-compatible streaming improvements

- SSE tool events now use standard choices[].delta.tool_calls[] format
  (previously used custom format with empty choices[] that SDKs ignored)
- ChatMessage extended with ToolCalls/ToolCallID for multi-turn history
- Message flattening preserves tool call/result context
- UsageInfo extended with Anthropic cache tokens (cache_read, cache_create, cost)
- HTTP streaming requests include stream_options.include_usage
- MCP config auto-generated in harness when OpenClaw URL is set

## Tests

harness_test.go: 20 tests covering BuildClaudeArgs, tool mapping,
MCP config generation, provider parsing, config loading, error classification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ss extraction

Now that inference execution is delegated to the harness package, these
functions and types are unreachable from any live code path:

Removed functions:
- RunInference, RunInferenceStream, RunInferenceWithRetry
- runHTTPInference, runHTTPInferenceStream
- buildClaudeArgs, generateMCPConfig
- mapToolsToCLINames, mapToolName, buildContextMetrics
- classifyError, classifyHTTPError, injectContinuationContext

Removed types:
- OpenAIChatRequest, OpenAIChatMessage, OpenAIChatResponse, OpenAIStreamChunk
- DefaultMaxRetries, DefaultTimeout, BaseRetryDelay constants

Removed files:
- inference_test.go (tested dead functions; harness_test.go covers these)

Retained (still used by kernel_harness.go converters and serve.go handlers):
- InferenceRequest/Response, ContextState, StreamChunkInference, etc.
- ProviderType, ProviderConfig, DefaultProviders
- RequestRegistry, ContinuationState, readContinuationState
- CLI commands, event emission, signal management

inference.go: 3416 → 1485 lines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code sets CLAUDECODE in its process environment. When CogOS
spawns `claude` CLI as a subprocess, inheriting this variable causes
the CLI to refuse with "cannot be launched inside another session."

Filter it out of the subprocess environment in both sync and streaming
inference paths, allowing CogOS to run inside Claude Code sessions.
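
Filtering one variable out of an inherited environment can be done as below. `withoutEnv` is a hypothetical helper name; the commit does not show how the harness implements the filter.

```go
package main

import "strings"

// withoutEnv returns a copy of env with every "KEY=value" entry for the
// given key removed. Matching on "KEY=" avoids dropping unrelated variables
// that merely share the prefix.
func withoutEnv(env []string, key string) []string {
	prefix := key + "="
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if !strings.HasPrefix(kv, prefix) {
			out = append(out, kv)
		}
	}
	return out
}
```

Typical use would be `cmd.Env = withoutEnv(os.Environ(), "CLAUDECODE")` before starting the subprocess.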

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude CLI handles tool execution internally (Bash, Read, Grep, etc.)
and returns the final text result. Previously, CogOS relayed these
internal tool_use events through the SSE stream as standard OpenAI
choices[].delta.tool_calls, causing OpenClaw to interpret them as
actionable tool calls and fail with "Tool not found" for each one.

Changed tool_use and tool_use_delta SSE events to use empty choices[]
(informational only). CogOS-aware clients can still observe tool
activity via the event_type and tool_call fields. Removed now-unused
StreamToolCallDelta/StreamChoiceDelta types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Go's http.Server.WriteTimeout is an absolute deadline that kills SSE
connections after exactly 5 minutes. Use http.NewResponseController to
push the write deadline forward before every SSE write and keepalive
flush, converting the hard cap into a rolling idle timeout.
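
A sketch of the rolling-deadline technique, assuming Go 1.20 or later (where http.NewResponseController exists). `writeSSE` and `streamOnce` are hypothetical names, and the 30-second idle window is an example value, not the kernel's setting.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// writeSSE pushes the connection's write deadline forward, then writes and
// flushes one SSE event. Calling it before every event turns the server's
// absolute WriteTimeout into an idle timeout.
func writeSSE(w http.ResponseWriter, idle time.Duration, payload string) error {
	rc := http.NewResponseController(w)
	err := rc.SetWriteDeadline(time.Now().Add(idle))
	// Some wrapped writers cannot set deadlines; tolerate that case here.
	if err != nil && !errors.Is(err, http.ErrNotSupported) {
		return err
	}
	if _, err := fmt.Fprintf(w, "data: %s\n\n", payload); err != nil {
		return err
	}
	return rc.Flush()
}

// streamOnce is a usage example: it serves two events from a throwaway test
// server and returns the body the client received.
func streamOnce() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		for _, msg := range []string{"one", "two"} {
			if err := writeSSE(w, 30*time.Second, msg); err != nil {
				return
			}
		}
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}
```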

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirrors the bus_stream.go fix (0de0a7a) in handleStreamingResponse.
Without this, Go's absolute WriteTimeout (5 min) kills long-running
inference SSE streams. Uses http.NewResponseController to push the
deadline forward before each SSE write, converting to idle-based timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire agent CRD definitions (.cog/bin/agents/definitions/*.agent.yaml)
into the CogOS runtime:

- agent_crd.go: CRD loader (LoadAgentCRD, ListAgentCRDs, GetAgentCRDToolPolicy)
- openclaw_agent_projector.go: Reconcilable provider that projects CRDs into
  OpenClaw's openclaw.json (cog reconcile --resource openclaw-agents)
- serve.go: Agent-aware tool policy enforcement in handleChatCompletions —
  looks up agent CRD by UCP Identity name, applies allowedTools from CRD
- harness/claude.go: --dangerously-skip-permissions now conditional on CRD
- harness/interfaces.go: KernelServices.GetAgentToolPolicy method
- kernel_harness.go: Adapter implementation for agent tool policy lookup

Priority chain for tool restrictions:
  1. X-Allowed-Tools header (explicit caller override)
  2. Agent CRD spec.modelConfig.allowedTools (policy enforcement)
  3. req.Tools mapped to CLI names (request body fallback)
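
The priority chain above could be expressed as a single resolver. This is a sketch under assumptions: the comma-separated header format is a guess, `resolveAllowedTools` is a hypothetical name, and `cliNames` contains only the two mappings named in the earlier commit message (exec→Bash, read→Read).

```go
package main

import "strings"

// cliNames holds only the two mappings named in the commit log; the real
// table is presumably larger.
var cliNames = map[string]string{"exec": "Bash", "read": "Read"}

// resolveAllowedTools applies the priority chain: explicit header override,
// then agent CRD policy, then request-body tools mapped to CLI names.
func resolveAllowedTools(header string, crdAllowed, reqTools []string) []string {
	if strings.TrimSpace(header) != "" {
		var out []string
		for _, name := range strings.Split(header, ",") {
			if name = strings.TrimSpace(name); name != "" {
				out = append(out, name)
			}
		}
		return out
	}
	if len(crdAllowed) > 0 {
		return crdAllowed
	}
	var out []string
	for _, tool := range reqTools {
		if cli, ok := cliNames[tool]; ok {
			out = append(out, cli)
		}
	}
	return out
}
```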

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CRD's dangerouslySkipPermissions field is for direct Claude Code
shell usage, not the CogOS harness. The subprocess has no TTY — without
the flag it would hang waiting for interactive approval. Tool restriction
is enforced via --allowed-tools, not permission prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace map-based marshal (which alphabetizes keys) with targeted
value replacement that preserves the original file structure byte-for-byte.
Only the agents.list array is rewritten; everything else stays untouched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…on and moves

Two bugs in the Discord reconciler:

1. applyPlan() could not resolve parent_id for pre-existing categories during
   channel creation — only tracked newly-created category IDs. Now fetches live
   channels and builds a category name→ID map for fallback resolution.

2. computePlanWithState() only detected channel moves when a channel had a
   *wrong* parent (different category), but not when it had *no* parent (nil
   ParentID). Channels created outside a category were never moved into the
   correct one. Now also handles the nil → category case and orphaned channel
   name matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…, channels, and gateway provider

Bus event system:
- bus watch CLI (cog bus watch/list/tail) with schema-driven filtering
- bus tool router for cross-agent RPC dispatch
- bus capabilities advertisement and caching
- event discord bridge formatting for all block types
- reactor for deterministic event-driven rules
- block index for content-addressed lookups

BEP sync engine:
- Protocol Buffers wire format for block exchange
- TLS-secured peer connections
- Index-based efficient sync with bloom filters
- Provider integration for reconciliation framework

Channel bridge:
- Channel config loading and lookup
- CLI commands (cog channel post/send/read/list)
- Webhook config for events channel

Gateway and cluster:
- OpenClaw gateway provider with health, plan, apply
- OpenClaw cron projector
- Gateway client for remote API calls
- Node and cluster CLI commands

Kernel enhancements:
- CogBlock V2 type with full-envelope hashing
- System event type constants
- Memory scoping per agent
- Inference enriched payloads (model, tokens, trace)
- Reconciliation serve endpoint for remote management
- Agent CRD extensions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OCI content-addressed binary store (.cog/oci/) enables zero-downtime
kernel updates. Running kernel watches index.json via fsnotify, compares
layer digests (not manifest digests — those change on every push due to
timestamps), pulls new binary, drains in-flight requests, and
syscall.Exec replaces the process. Same PID, same args, same port.

SSE bus-stream now evicts oldest subscriber when at capacity instead of
rejecting new connections, preventing the openclaw-gateway reconnection
storm from monopolizing all slots.
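
Evict-oldest admission can be sketched with a mutex-guarded slice kept in arrival order. The `hub` and `subscriber` types are hypothetical; bus_stream.go's actual data structures are not shown here.

```go
package main

import "sync"

// subscriber is a hypothetical SSE client with a buffered event channel.
type subscriber struct {
	id     string
	events chan string
}

// hub tracks subscribers in arrival order (oldest first).
type hub struct {
	mu   sync.Mutex
	max  int
	subs []*subscriber
}

// subscribe admits a new subscriber. At capacity it evicts the oldest one
// instead of rejecting the newcomer, and reports the evicted ID.
func (h *hub) subscribe(id string) (sub *subscriber, evicted string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.max > 0 && len(h.subs) >= h.max {
		oldest := h.subs[0]
		close(oldest.events) // tells the evicted client's handler to return
		h.subs = h.subs[1:]
		evicted = oldest.id
	}
	sub = &subscriber{id: id, events: make(chan string, 16)}
	h.subs = append(h.subs, sub)
	return sub, evicted
}
```

Any publisher must take the same lock before sending, so that it never writes to a channel that eviction has already closed.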

New files: oci.go (OCI layout store via oras-go), cmd_oci.go (CLI).
Modified: serve.go (watcher + re-exec), bus_stream.go (eviction),
cog.go (register oci command, version shows digest), Makefile (push target).

Developer workflow: make push → kernel auto-reloads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clients (OpenClaw, BrowserOS) connect to /v1/events/stream without
bus_id, which previously returned 400. Now defaults to wildcard "*"
subscription that receives events from ALL buses — matching the older
kernel behavior. Replay is skipped for wildcard subscribers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ing pipeline

Add vector embedding infrastructure to the TAA Tier 4 semantic retrieval
pipeline, replacing pure BM25 keyword search with a hybrid scoring system
that can learn from production data.

Phase A (Embed): nomic-embed-text-v1.5 server, Go client, 768/128-dim
  BLOB storage in constellation schema, cosine similarity utilities.
Phase B (Index): batch backfill indexer, async write hook, freshness checker.
Phase C (Score): query-time embedding, dual heuristic+embedding scoring,
  variable-resolution loading (full/section/metadata by rank), shadow logging.
Phase D (Observe): post-inference response embedding hook, usefulness labeler
  (useful/wasted/missed), training data JSONL writer.
Phase E (Train): data prep, scikit-learn linear probe training, Go probe scorer,
  gradual blend switchover, automated retraining pipeline.
Phase U (Unify): score-driven dynamic budget allocation with safety floors
  (Identity≥15K, Temporal≥10K, Present≥20K, Semantic≥5K).

All features default to enabled: false for safe incremental activation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ring

Add POST /v1/context/foveated endpoint that accepts iris signals (context
window size + used tokens) and returns variable-resolution context. Budget
scales proportionally: effectiveBudget = min(irisAvailable, profileMax).
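
The budget rule reduces to a clamped minimum. The sketch assumes irisAvailable is the context window size minus the tokens already used, which the commit implies but does not state outright.

```go
package main

// effectiveBudget returns min(irisAvailable, profileMax), where
// irisAvailable is assumed to be windowSize - usedTokens, floored at zero.
func effectiveBudget(windowSize, usedTokens, profileMax int) int {
	available := windowSize - usedTokens
	if available < 0 {
		available = 0
	}
	if available < profileMax {
		return available
	}
	return profileMax
}
```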

Tier 4 semantic search now uses score-based thresholds instead of rank-based
cutoffs — thresholds scale quadratically with pressure so resolution degrades
gracefully under context window pressure.

New functions: ConstructContextStateWithIris, QueryConstellationWithIris.
24 new tests covering budget scaling, threshold math, endpoint behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace p² pressure scaling with the SRC-derived isometry defect
δ(p) = 2p - p². Derived from ρ²(r) = (2/3)·e^(-2r) under pressure-delay
mapping r = -ln(1-p). More aggressive at moderate pressure — matches
front-loaded fidelity loss from the covariance structure.
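
One reading consistent with the formulas above: substituting r = -ln(1-p) gives e^(-2r) = (1-p)², so ρ²(r) = (2/3)(1-p)², and the normalized loss 1 - ρ²(r)/ρ²(0) equals 1 - (1-p)² = 2p - p². The normalization by ρ²(0) is an assumption; the commit states only the end result. The sketch below compares the old and new scalings and checks the algebra numerically.

```go
package main

import "math"

// quadraticPressure is the previous p² scaling.
func quadraticPressure(p float64) float64 { return p * p }

// isometryDefect is the replacement scaling δ(p) = 2p - p².
func isometryDefect(p float64) float64 { return 2*p - p*p }

// rhoSquared is ρ²(r) = (2/3)·e^(-2r) from the commit message.
func rhoSquared(r float64) float64 { return (2.0 / 3.0) * math.Exp(-2*r) }

// defectFromCorrelation recomputes δ(p) from ρ² under the pressure-delay
// mapping r = -ln(1-p), normalized by ρ²(0). Valid for 0 <= p < 1.
func defectFromCorrelation(p float64) float64 {
	r := -math.Log(1 - p)
	return 1 - rhoSquared(r)/rhoSquared(0)
}
```

At p = 0.5 the old scaling gives 0.25 and the new one 0.75, which is the "more aggressive at moderate pressure" behavior the commit describes; both agree at p = 0 and p = 1.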

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chazmaniandinkle and others added 23 commits April 7, 2026 22:06
Codex gpt-5.4 review findings resolved:
- All acronyms expanded on first use (SSM, PLE, LoRA, LoRO, TRM, D_STATE)
- openclaw-plugin added to all ecosystem tables
- skills README: git-forensics corrected in cogos-dev-tools listing
- All port 5100 references updated to 6931 in source + manifests
- mod3 clone URL updated to cogos-dev/mod3
- mod3 incomplete sentence fixed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The version command outputs "cogos version=dev build=..." but the test
grepped for "cogos build=" (adjacent). Fixed to "cogos.*build=".

Also replaced bare $? check with inline if/grep to avoid set -e
triggering on the intermediate grep failure.

15/15 e2e tests pass on fresh workspace with alternate port.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New test script that validates the complete CogOS stack against a
running Ollama instance with real model inference:

- Pre-flight: Ollama availability + model check
- Init: fresh workspace with seeded CogDocs
- Serve: daemon on configurable alternate port
- API: health, identity, context, nucleus
- Chat: real OpenAI-compatible completion + streaming SSE
- Foveated: context assembly with seed documents
- Memory: search returns seeded docs
- Coherence: workspace validation
- Messages: Anthropic API compatibility
- Shutdown: clean exit

20 tests. Configurable via E2E_MODEL, E2E_PORT, OLLAMA_HOST.
Default model: qwen3.5:0.8b (fast). Override for full test:
  E2E_MODEL=gemma4:e4b bash scripts/e2e-integration.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two agent implementations:
- sandbox-agent.py: Python tool-use loop with sandbox enforcement
- sandbox-agent.sh: Bash version (simpler, but has issues with larger tool JSON)

Tested: Gemma 4 26B generates correct tool calls (find, wc),
sandbox blocks dangerous commands, agent completes in 2 turns.

Also validated: Claude Code works with CogOS as backend via
ANTHROPIC_BASE_URL=http://localhost:6931 for non-tool-use inference.
Tool use requires Anthropic→OpenAI tool schema translation (future).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adapted from the existing 198-experiment autoresearch program.
Uses Claude Code as the agent harness with:
- Local inference (gemma4:26b via CogOS) for exploration (free)
- Remote inference (Claude API) for synthesis checkpoints (budgeted)
- CogOS foveated context for workspace awareness
- Same adaptive search strategy (EXPLORE/EXPLOIT/PLATEAU/RECOVER)
- Safety: sandboxed, git-committed, revertible
- Clock-time moderation: local for legwork, remote for thinking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
overnight-ablation.py: Four-condition ablation measuring context
assembly value. Conditions: A (stock), B (RAG), C (foveated),
D (foveated+tools). 15 workspace QA questions + 5 MMLU-Pro control.
Computes differentials: what each CogOS layer adds over baseline.

overnight-cascade.sh: Multi-model cascade runner for autonomous
research. Spawns parallel agents, collects observations, supervisor
analyzes agreement/disagreement patterns. Codex or local supervisor.

Both run on local Ollama models — zero Claude credits overnight.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Autonomous research loop comparing stock vs RAG vs foveated context:
- 15 workspace QA questions with known ground truth
- Rotates conditions A (stock), B (RAG), C (foveated) each question
- Supervisor barge-in every 5 minutes: reads chain, analyzes patterns,
  writes updated guidance
- Thermal-aware: monitors CPU temp, pauses on throttle
- All observations in append-only JSONL chain file
- Summary with differentials on Ctrl+C

Running overnight on gemma4:26b with CogOS kernel for foveated context.
Zero Claude credits. First real test of the EA/EFM thesis prediction
that boundary quality > model size.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two critical fixes:
1. Run ALL conditions (A/B/C) on each question before moving to
   next question. Previous version rotated conditions, meaning no
   question was ever tested under multiple conditions in the same
   cycle. Supervisor correctly identified this flaw.
2. Wrap supervisor barge-in in try/except so Ralph survives if the
   supervisor call times out or fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Single-file SPA with auto-refresh (15s):
- Condition score cards (Stock/RAG/Foveated with running averages)
- Score timeline scatter plot (Chart.js)
- Condition average bar chart
- Paired comparison table with differential bars (C-A, B-A)
- Live supervisor guidance panel
- Live log tail

Dark theme, monospace. Reads from Ralph's chain.jsonl.
python3 scripts/ablation-dashboard.py --port 8421

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Proper autoresearch setup mirroring the TRM loop structure:
- program.md: research brief with adaptive search strategy
- config.yaml: the single modifiable file (zone budgets, salience, TRM, iris)
- eval.py: fixed eval harness (15 questions, 3 conditions, context NDCG)
- results.tsv: append-only experiment log

Baseline results (think=False, no TRM weights):
  Stock:    0.267
  RAG:      0.211
  Foveated: 0.261
  C-A:      -0.006 (no improvement — foveated zones are empty)
  NDCG:     0.000 (engine selected zero correct documents)

Root cause: TRM weights not loaded, knowledge zones have structural
headers but no document content. RAG performs WORSE than stock because
grep keywords match wrong documents.

This is the baseline. The loop will optimize context assembly parameters
to maximize C-A differential.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite README for clarity and approachability:
- One-sentence description, build command, port number in first 6 lines
- Real numbers: 0.900 NDCG, 2.3M params, 180+ experiments
- Universal ASCII architecture diagram
- Getting started guide, API table, project layout
- Remove metaphors, philosophy, and marketing language
- Link to design docs for deep-dive readers

Add CONTRIBUTING.md with dev setup, testing, and submission guidelines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements Provider interface against any OpenAI-compatible API server.
Supports both streaming (SSE) and non-streaming chat completions.
Auto-discovers LM Studio on localhost:1234 at startup.

Provider types in providers.yaml: openai-compat, lmstudio, vllm, llamacpp
All map to the same OpenAICompatProvider with sensible defaults.

Includes:
- Complete() and Stream() with full tool call support
- Available(), Ping(), Capabilities() probing
- 18 unit tests + integration test suite
- Zero CGO dependencies, cross-compiles to Windows

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- extract-signals.py: idempotent signal extraction from Claude Code sessions
  (crystallization, cascade, provenance, behavioral features)
- nightly-consolidation.py: 5-stage closed loop (LOG→INDEX→LEARN→DEPLOY→REPORT)
- survey-traces.py: corpus survey tool (parsed 200 sessions, 1910 exchanges)
- eval.py v2: 4-condition evaluation with cosine baseline, real CogDoc sources
- context_assembly.go: fix relative path resolution for TRM-scored documents
- results.tsv: baseline measurements (stock=0.267, foveated=0.294)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update CI to Go 1.25, add go vet and golangci-lint jobs, trigger on all pushes
- Fix Available() returning true when configured model is not loaded on server
- Add Unreleased section to CHANGELOG.md with recent additions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the complete evaluation story:
- 4-layer methodology (validation NDCG, cosine baseline, context judge, response eval)
- Training data: 2,298 signals from 805 Claude Code sessions
- 508 total experiments (439 Mamba + 69 transformer)
- Honest numbers: 0.878 mean NDCG@10 (0.900 peak is +4.4 sigma outlier)
- Known limitations: single workspace, MPS non-determinism, small validation set
- How to reproduce

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Link to docs/EVALUATION.md for full methodology.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three FCE bugs found by debug trace:
- Raw TRM scores (-9 to +0.5) destroyed cosine signal in 60/40 blend.
  Now sigmoid-normalize to [0,1] with 70/30 cosine/TRM weights.
- Multiple chunks from same file returned as separate results.
  Added path-level dedup (keep highest-scoring chunk per path).
- Accept both "query" and "prompt" fields in foveated API.
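
The first two fixes can be sketched as follows. The `chunk` type and function names are hypothetical; only the sigmoid normalization, the 70/30 cosine/TRM weights, and the keep-highest-per-path rule come from the commit message.

```go
package main

import (
	"math"
	"sort"
)

// sigmoid squashes a raw TRM score (observed range roughly -9 to +0.5)
// into (0, 1) so it cannot swamp the cosine term.
func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// blend combines cosine similarity and the normalized TRM score 70/30.
func blend(cosine, rawTRM float64) float64 {
	return 0.7*cosine + 0.3*sigmoid(rawTRM)
}

// chunk is a hypothetical scored chunk of a document.
type chunk struct {
	Path  string
	Score float64
}

// dedupByPath keeps the highest-scoring chunk per path and returns the
// survivors ordered by descending score.
func dedupByPath(chunks []chunk) []chunk {
	best := make(map[string]chunk)
	for _, c := range chunks {
		if cur, ok := best[c.Path]; !ok || c.Score > cur.Score {
			best[c.Path] = c
		}
	}
	out := make([]chunk, 0, len(best))
	for _, c := range best {
		out = append(out, c)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Path < out[j].Path // deterministic tie-break
	})
	return out
}
```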

Result: TRM-scored docs now return 10/10 instead of 0/N for most queries.
Manfred Eigen docs surface correctly in top 10 for relevant queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove eval-details.json (private workspace doc titles leaked)
- Remove .gotmp/ build cache (binary artifacts)
- Replace /Users/slowbro/ hardcoded paths with $HOME equivalents
- Add eval-details.json and .gotmp/ to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
First implementation of a ModalityModule backed by HTTP instead of subprocess
IPC. Wraps an external service (like Mod³) as a standard Gate/Decoder/Encoder
module on the kernel's modality bus.

- HTTPModule implements full ModalityModule interface
- httpEncoder: POST /v1/synthesize → EncodedOutput (WAV bytes)
- httpGate: POST /v1/vad → GateResult (speech detection)
- Health polling with 3-failure degradation escalation
- Start: polls /health until reachable, Stop: POST /shutdown
- 359 lines, stdlib only, compiles clean

The bus routes through this identically to subprocess modules — same
interface, different transport. Any HTTP service with /health + transform
endpoints can be wrapped as a ModalityModule.
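
A rough sketch of the health-polling idea, assuming "3-failure degradation escalation" means three consecutive failed probes. The `healthTracker` type and `probe` helper are hypothetical and are not the HTTPModule's real internals; only the /health path comes from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// healthTracker counts consecutive probe failures.
type healthTracker struct {
	threshold int
	failures  int
}

// record notes one probe result and reports whether the module should now
// be treated as degraded. Any success resets the failure counter.
func (h *healthTracker) record(ok bool) (degraded bool) {
	if ok {
		h.failures = 0
		return false
	}
	h.failures++
	return h.failures >= h.threshold
}

// probe reports whether GET <baseURL>/health answered 200 OK.
func probe(client *http.Client, baseURL string) bool {
	resp, err := client.Get(baseURL + "/health")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	// Stand-in service whose health can be toggled for the demo.
	var up atomic.Bool
	up.Store(true)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !up.Load() {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 2 * time.Second}
	tracker := &healthTracker{threshold: 3}
	for i := 0; i < 5; i++ {
		if i == 1 {
			up.Store(false) // the service goes down after the first poll
		}
		fmt.Printf("poll %d degraded=%v\n", i, tracker.record(probe(client, srv.URL)))
	}
}
```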

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oint

Pipeline instantiates in cmdServeForeground(), auto-discovers voice
modules from service CRDs (bus.advertise: true), registers HTTPModule,
and exposes /v1/speak for kernel-routed voice output. Pipeline status
visible on /health endpoint.

- modality_serve.go: registerHTTPModules() + handleSpeak()
- service_crd.go: add Modalities []string to CRD spec
- serve.go: pipeline field on serveServer + route registration
- serve_daemon.go: step 6d — pipeline lifecycle in daemon
- serve_context.go: pipeline status on /health

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge the public fork's 46 commits (docs, CI, TRM eval, autoresearch,
e2e tests, internal/engine layout) with the local 45 commits (modality
pipeline, service mesh, foveated context, session lifecycle).

Conflict resolution:
- .gitignore: combined both (local patterns + remote coverage/eval)
- Dockerfile: local build (CGO/FTS5) + remote port (6931)
- Makefile: local targets + remote Docker/e2e/bench targets
- README.md: accepted remote (polished public docs)
- go.mod/go.sum: kept local (active code depends on these deps)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run go mod tidy to resolve new dependencies from remote's internal/engine
package. Fix TestClaudeCodeBuildPromptIncludesMultipleUserTurns to match
actual prompt format ([user]: prefix, not "User Turn" heading).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integrate upstream fixes (golangci-lint v7, lint issue resolutions,
model weight gitignore, PII removal) with fork's modality pipeline.

Conflicts resolved:
- ci.yml: take upstream's golangci-lint-action@v7 (v2.11)
- .gitignore: merge both sides (root-level binaries + model weights + OS ignores)
- provider_prompt_test.go: take upstream's improved assertion message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chazmaniandinkle merged commit b7f61a2 into cogos-dev:main on Apr 13, 2026
3 of 5 checks passed
chazmaniandinkle referenced this pull request in chazmaniandinkle/cogos Apr 14, 2026
Add ResolveToFieldKey and FieldKeyToURI — the two directions of the
holographic pointer. Any URI form (cog://, cog:, memory-relative,
absolute path) normalizes to the field's canonical key. The inverse
projects field keys back to portable cog:// URIs.

Fixes Codex review finding #2: attention.boost now resolves URIs
before calling Field().Boost(), so boosts via MCP actually match
field entries. query_field returns proper cog:// URIs instead of
raw filesystem paths.

Fixes review finding #3: node probes now run concurrently with
goroutines. Total wall time is ~2s regardless of service count
instead of 2s × N services blocking the heartbeat loop.
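
The concurrency change can be illustrated with a small sketch: one goroutine per service, all sharing a single deadline, so total wall time is bounded by the slowest probe rather than the sum. The `probeFunc` type, `probeAll`, `demoProbe`, and `probeTimeout` are hypothetical names, not the repository's.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// probeTimeout mirrors the roughly 2s budget mentioned in the commit message.
const probeTimeout = 2 * time.Second

// probeFunc is a hypothetical per-service check that honors ctx cancellation.
type probeFunc func(ctx context.Context, service string) bool

// probeAll runs every probe in its own goroutine under one shared deadline.
func probeAll(services []string, timeout time.Duration, probe probeFunc) map[string]bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	results := make([]bool, len(services)) // each goroutine writes only its own slot
	var wg sync.WaitGroup
	for i, svc := range services {
		wg.Add(1)
		go func(i int, svc string) {
			defer wg.Done()
			results[i] = probe(ctx, svc)
		}(i, svc)
	}
	wg.Wait()

	out := make(map[string]bool, len(services))
	for i, svc := range services {
		out[svc] = results[i]
	}
	return out
}

// demoProbe simulates a 50ms check; the service named "down" fails.
func demoProbe(ctx context.Context, service string) bool {
	select {
	case <-time.After(50 * time.Millisecond):
		return service != "down"
	case <-ctx.Done():
		return false
	}
}

func main() {
	start := time.Now()
	res := probeAll([]string{"a", "b", "down"}, probeTimeout, demoProbe)
	fmt.Println(res, "elapsed:", time.Since(start).Round(10*time.Millisecond))
}
```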

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chazmaniandinkle added a commit that referenced this pull request Apr 21, 2026
Track 1 of the Windows dev-preview rollout (PRs #1 and #2 in Agent K's
release audit).

Makefile:
- Add windows-amd64 and windows-arm64 cross-compile targets, mirroring
  release.yml exactly (CGO_ENABLED=0, ./cmd/cogos/ entry point, .exe
  suffix). Extends the existing PLATFORMS list rather than introducing
  a new pattern.

docs/RELEASING.md:
- Add an "Installing on Windows (developer preview)" section covering
  PowerShell download, SmartScreen "More info -> Run anyway" workaround
  for the unsigned binary, %LOCALAPPDATA%\cogos\ install path with User
  PATH update, and a version/serve/health sanity check using the actual
  subcommands exposed by internal/engine/cli.go.
- Note that Windows Service / SCM integration is a follow-up, not part
  of v0.x.

No version tag is cut; no .go code is touched. go build and go vet are
both clean. Windows targets were verified locally to produce valid
PE32+ executables (amd64 and arm64).
chazmaniandinkle pushed a commit that referenced this pull request Apr 22, 2026
… gap

Closes Agent F gap #3 (session management, CRITICAL) — the last of the
eight critical MCP surface gaps. Implements the hybrid design in
cog://mem/semantic/surveys/2026-04-21-consolidation/
agent-P-session-management-evaluation with a few user-approved amendments
(see below).

Kernel changes
--------------
- internal/engine/sessions.go: typed SessionState, SessionRegistry,
  HandoffState, HandoffRegistry with RWMutex / Mutex guards; session-id
  format validation; idempotent-register "update semantics"; atomic
  first-wins claim with TTL enforcement; replay-from-bus at startup so
  the in-memory view rebuilds from bus_sessions + bus_handoffs.
- internal/engine/serve_sessions_mgmt.go: 8 HTTP handlers —
    POST /v1/sessions/register
    POST /v1/sessions/{id}/heartbeat
    POST /v1/sessions/{id}/end
    GET  /v1/sessions/presence
    POST /v1/handoffs/offer
    GET  /v1/handoffs
    POST /v1/handoffs/{id}/claim
    POST /v1/handoffs/{id}/complete
  The existing /v1/sessions and /v1/sessions/{id}[/context] routes (TAA
  inference context, regression-locked) are preserved untouched; the new
  specific patterns coexist thanks to Go 1.22 method-aware routing.
- internal/engine/mcp_sessions.go: 8 cog_* MCP tools over the same
  registries so a future native client (Wave widget, desktop app, cog
  CLI) can use handoff without the Python bridge (amendment #5 — two
  MCP surfaces coexist by design).
- internal/engine/sessions_test.go: 15 unit + integration tests
  (validation, lifecycle 404/409, active-window presence, task-field
  validation, 8-way concurrent claim atomicity, TTL expiry, phantom
  offer, complete-without-claim, replay rebuilds state, claim_rejected
  observability, end-to-end MCP round-trip).

Amendments applied vs the survey
--------------------------------
1. No parallel coexistence. All four consumers are in-tree (this PR,
   the bridge on a local branch, the skill doc, and cmd_bus.go);
   migrated atomically. The survey's Open Question #1 was skipped.
2. Idempotent register = update-semantics (survey's Open Question #2
   recommendation). Re-register during the active window updates the
   in-memory row; re-register after end is allowed if the prior row is
   ended or its heartbeat is outside the active window.
3. `handoff.claim_rejected` event emitted on every rejected claim, with
   reason ∈ {already_claimed, ttl_expired, offer_not_found,
   out_of_order}, attempting_session, and conflicting_session when
   relevant. Cheap; big audit value (amendment #4).
4. Two MCP surfaces coexist by design — 8 cogos_* bridge tools over the
   Python sandbox + 8 cog_* kernel-native tools via /mcp. Both hit the
   same kernel registries (amendment #5).
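
The atomic first-wins claim with TTL enforcement might be sketched like this. It is an illustration only: the types, method names, and the order of the checks are assumptions, and the `out_of_order` reason is omitted because the commit message does not define when it applies.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoTTL is an arbitrary offer lifetime chosen for the example.
const demoTTL = time.Minute

type offer struct {
	expiresAt time.Time
	claimedBy string
}

// handoffRegistry guards its offers with one mutex so that check-and-set
// inside claim is a single critical section.
type handoffRegistry struct {
	mu     sync.Mutex
	offers map[string]*offer
}

func newRegistry() *handoffRegistry {
	return &handoffRegistry{offers: make(map[string]*offer)}
}

func (r *handoffRegistry) offer(id string, ttl time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.offers[id] = &offer{expiresAt: time.Now().Add(ttl)}
}

// claim returns "" on success, or the rejection reason otherwise.
func (r *handoffRegistry) claim(id, session string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	o, ok := r.offers[id]
	switch {
	case !ok:
		return "offer_not_found"
	case time.Now().After(o.expiresAt):
		return "ttl_expired"
	case o.claimedBy != "":
		return "already_claimed"
	}
	o.claimedBy = session
	return ""
}

// raceClaims lets n sessions claim the same offer concurrently and
// returns how many of them succeeded.
func raceClaims(n int) int {
	reg := newRegistry()
	reg.offer("h1", demoTTL)
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if reg.claim("h1", fmt.Sprintf("session-%d", i)) == "" {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return winners
}

func main() {
	fmt.Println("winners out of 8 concurrent claims:", raceClaims(8))
}
```

Holding the lock across both the lookup and the write is what makes the claim first-wins: a second claimant cannot observe the offer as unclaimed once the first has entered the critical section.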

Bridge migration
----------------
A paired local branch on cog-sandbox-mcp (`feat/sessions-kernel-native-
bridge`, NOT pushed) refactors the 8 cogos_* tools to shim over the new
kernel routes, removes client-side aggregation, and rewrites
tests/test_session_handoff.py for the new wire shape. Bridge MCP
signatures and the never-raise {"success": False, "error": ..., "bus_id"}
envelope are preserved — no breaking change for agents using the bridge.

Testing
-------
- `go build ./...`, `go vet ./...`: silent.
- `go test ./internal/engine/... -short -race -count=1`: green
  (pre-existing + new suite passes under race detector).
- End-to-end smoke on port 6932 with a test workspace: register →
  heartbeat → offer → list → claim → second claim (→ 409 +
  claim_rejected event) → complete. bus_sessions chain: 3 events.
  bus_handoffs chain: 4 events (offer, claim, claim_rejected,
  complete). Bridge tools replayed the same flow against the live
  kernel with back-compat response shapes intact.
- The running kernel at :6931 was NOT touched during this work.

Survey reference: cog://mem/semantic/surveys/2026-04-21-consolidation/
agent-P-session-management-evaluation