Description
PR: #1373
Explanation: I am holding off on implementing the manual history tracking fix for StatefulExecutorBase in this PR. Manually rebuilding the KV cache for models that don't support native memory shifting is complex and risks silent state corruption, especially with session caching and multimodal inputs. Keeping the ContextOverflowException guard as-is forces the calling application to handle the overflow explicitly (e.g., by starting a new chat) rather than continuing with a desynced model state. We can revisit this if we see high user impact, but stability comes first.