Summary
After memory extraction via `session.commit()`, the semantic processor generates `.overview.md` for the parent memory directory. When this overview text exceeds the embedding model's context length, `OpenAIDenseEmbedder.embed()` raises an unhandled `RuntimeError`. This exception appears to block the uvicorn event loop, causing the entire HTTP server to become unresponsive (process alive, port open, but all endpoints hang).
Environment
- OpenViking: installed via pipx (latest as of 2026-03-17)
- OS: macOS arm64 (Darwin 25.3.0)
- Embedding: Ollama `nomic-embed-text` (8192-token context, 768 dimensions) via OpenAI-compatible API
- VLM: Bailian `qwen3-max`
- Mode: local, bound to `127.0.0.1:1933`
- Integration: OpenClaw `memory-openviking` plugin
Steps to Reproduce
- Accumulate enough memories in a directory (e.g. `viking://user/default/memories/preferences`)
- Trigger a new memory extraction (e.g. via session commit from OpenClaw)
- Memory extractor writes the new memory file successfully
- Semantic processor runs on the parent directory (`recursive=False`)
- Generated `.overview.md` aggregates all file summaries → text exceeds the embedding model's token limit
- Embedding queue calls `OpenAIDenseEmbedder.embed()` with the oversized text
- Ollama returns HTTP 400: `the input length exceeds the context length`
- `embed()` raises `RuntimeError` → `collection_schemas.py:on_dequeue` propagates the exception
- Server hangs: all HTTP endpoints stop responding, `curl` times out
Relevant Logs
INFO - Processing semantic generation for: viking://user/default/memories/preferences (recursive=False)
WARNING - Candidate data is None for label index 4 (label: ...), skipping.
INFO - Created memory file: viking://user/default/memories/preferences/mem_04c2ef28-...md
INFO - Enqueued memory for vectorization
stderr:
openai.BadRequestError: Error code: 400 - {'error': {'message': 'the input length exceeds the context length', ...}}
RuntimeError: OpenAI API error: Error code: 400 - ...
After this error, no further log output appears and all HTTP requests time out.
Root Cause Analysis
Two issues combine:
- No input truncation guard in `OpenAIDenseEmbedder.embed()` (`openai_embedders.py`): text is passed directly to the API without any length check. When the embedding model has a limited context window, oversized input causes a hard API error.
- Unhandled exception in embedding queue blocks uvicorn: the `RuntimeError` from the embedder propagates through `collection_schemas.py:on_dequeue` and appears to block or crash the async event loop, making the entire server unresponsive.
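A minimal sketch of the missing guard for the first issue, assuming a character-based budget stands in for a real token count (`safe_truncate` and `MAX_EMBED_CHARS` are illustrative names, not OpenViking's API):

```python
MAX_EMBED_CHARS = 24_000  # ~6000-8000 tokens for typical BPE vocabularies

def safe_truncate(text: str, limit: int = MAX_EMBED_CHARS) -> str:
    """Clamp embedding input to a conservative character budget.

    A character cap is only a crude proxy for the model's token limit
    (8192 tokens for nomic-embed-text), but it guarantees the provider
    never has to reject the input with HTTP 400.
    """
    if len(text) <= limit:
        return text
    # Prefer cutting on whitespace so a word is not split mid-token.
    cut = text.rfind(" ", 0, limit)
    return text[:cut] if cut > 0 else text[:limit]
```

Calling this on the overview text before handing it to the provider would turn the hard 400 into, at worst, a slightly truncated embedding.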
Expected Behavior
- Embedding input should be truncated (or chunked) before being sent to the provider
- Embedding failures should be caught gracefully without blocking the HTTP server
- The server should remain responsive even if individual vectorization tasks fail
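The last two expectations could be met with a defensive dequeue handler along these lines (the real `on_dequeue` signature in `collection_schemas.py` is not known here, and this assumes `embed()` is synchronous; this only illustrates the pattern):

```python
import asyncio
import logging

logger = logging.getLogger("embedding-queue")

async def on_dequeue(task, embedder):
    """Vectorize one queued item without letting a failure take down the loop."""
    try:
        loop = asyncio.get_running_loop()
        # Run the synchronous, network-bound embed() in a worker thread so a
        # slow or failing provider cannot stall uvicorn's event loop.
        vector = await loop.run_in_executor(None, embedder.embed, task.text)
    except Exception:
        # One failed vectorization task is logged and skipped; the HTTP
        # server and the rest of the queue keep running.
        logger.exception("Embedding failed for %s; skipping", task.uri)
        return None
    return vector
```

With this shape, an oversized overview would log an error and drop one vector instead of wedging every endpoint.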
Workaround
Monkey-patched `OpenAIDenseEmbedder.embed()` to truncate input to 24000 chars (~6000-8000 tokens) before calling the API. The server remains stable after the patch.
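The workaround can be sketched as a generic patch helper. It is shown against a stand-in class so the snippet is self-contained; the actual import path for `OpenAIDenseEmbedder` is not verified here:

```python
EMBED_CHAR_LIMIT = 24_000  # ~6000-8000 tokens, under nomic-embed-text's 8192

def patch_embed(embedder_cls, limit: int = EMBED_CHAR_LIMIT):
    """Monkey-patch embedder_cls.embed to clamp its input length."""
    original = embedder_cls.embed

    def embed(self, text: str):
        # Truncate before the API call so the provider never sees
        # input longer than its context window can accept.
        return original(self, text[:limit])

    embedder_cls.embed = embed

# In the integration this would be applied to the real class, e.g.
# patch_embed(OpenAIDenseEmbedder)  # import path assumed, not verified
```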
Related Issues
- `add-resource` (closed, only addressed that path)