Bug Description
When OpenClaw uses OpenViking as the context engine, answer-time auto-recall may fail because the retrieval query is built directly from the latest user text or prompt, and that query can exceed the embedding model's max input length.
This is a different path from the already-discussed oversized embedding issues in memory commit or add-resource. Here the failure happens during answer-time recall / retrieval, so users see OpenClaw fail while answering.
Steps to Reproduce
- Enable the `examples/openclaw-plugin` integration and use OpenViking as the context engine.
- Send a very long user message or prompt that is later used as the recall query in `before_prompt_build`.
- Let the plugin trigger auto-recall before answer generation.
- The retrieval/query vectorization path forwards the oversized query to the embedding model.
- If the embedding provider has a strict max input length, the answer flow fails with a token-limit / oversized-input error.
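The failure path above can be sketched in a few lines. This is an illustrative model only: `embed_query`, `before_prompt_build`, and `MAX_EMBED_TOKENS` are hypothetical stand-ins, not the actual OpenViking or OpenClaw API.

```python
# Hypothetical sketch of the answer-time recall path.
# All names here are illustrative; they are NOT OpenViking's real API.

MAX_EMBED_TOKENS = 512  # assumed strict limit of the embedding provider

def embed_query(text: str) -> list[float]:
    # Stand-in for the embedding provider: rejects oversized input,
    # mirroring errors like "input length exceeds the context length".
    tokens = text.split()  # crude whitespace proxy for real tokenization
    if len(tokens) > MAX_EMBED_TOKENS:
        raise ValueError("input length exceeds the context length")
    return [0.0] * 8  # dummy embedding vector

def before_prompt_build(latest_user_text: str) -> list[float]:
    # The recall query is built directly from the latest user text,
    # with no sanitizing or capping, so an oversized prompt flows
    # straight through to the provider.
    return embed_query(latest_user_text)

# A sufficiently long prompt reproduces the failure:
try:
    before_prompt_build("word " * 10_000)
except ValueError as e:
    print(f"recall failed: {e}")
```

The point of the sketch is that the query length is bounded only by the user's prompt, so any provider-side limit becomes an answer-time failure.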
Expected Behavior
OpenClaw auto-recall should sanitize and cap oversized recall queries before they reach the embedding provider, so answer generation remains stable even when the latest user prompt is very long.
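One possible shape for such a cap is sketched below. This is a hedged sketch under assumptions: the helper name `sanitize_recall_query` and the character-based limit are illustrative, and a real fix (such as the one proposed in PR #1297) should cap by the embedding model's actual token limit rather than by characters.

```python
# Illustrative sketch of capping a recall query before embedding.
# The name and the character-based limit are assumptions, not the
# actual fix; a production version should use the model's token limit.

MAX_EMBED_CHARS = 2000  # assumed conservative cap

def sanitize_recall_query(text: str, max_chars: int = MAX_EMBED_CHARS) -> str:
    # Collapse runs of whitespace, then hard-cap the length so the
    # embedding provider never receives an oversized input.
    compact = " ".join(text.split())
    return compact[:max_chars]
```

Truncating the head of the prompt is the simplest policy; alternatives (keeping the tail, or summarizing before embedding) trade recall quality against complexity, and the sketch takes no position on which the fix should use.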
Actual Behavior
OpenClaw answer-time recall can fail because the retrieval query is too large for the embedding model, resulting in token-limit / oversized-input errors during the search path.
Minimal Reproducible Example
```python
# Repro is configuration-driven rather than a small Python snippet:
# 1. Configure OpenClaw to use OpenViking as the context engine
# 2. Send a very long user prompt
# 3. Observe recall/search failure before the model answer is produced
```
Error Logs
Typical symptom from embedding providers:
- input length exceeds the context length
- input sequence length exceeds the max input length of embedding model
- request rejected because the embedding input is too large
OpenViking Version
Observed on the current OpenClaw/OpenViking integration path as of April 2026; the exact affected version range is unconfirmed but likely includes recent 0.3.x releases.
Python Version
Unknown / varies by user environment
Operating System
Other
Model Backend
Other
Additional Context
Related but not identical issues:
A proposed fix already exists in PR #1297:
The key distinction is that this report is about the search / recall query path used while OpenClaw is answering, not about memory commit or resource ingestion.