CacheGuard - Smart Context Management for SillyTavern

Made with long-form roleplays in mind! CacheGuard manages your context window so that extended roleplays stay usable, with minimal performance impact.

The Problem

When your context window fills up, every new message forces the oldest messages to be dropped from the start of the conversation. This invalidates the prompt cache: the cached prefix no longer matches the new prompt, so the whole prompt has to be reprocessed and your fast generations suddenly become slow. Permanently. You go from quick responses to long waits in an instant, and it never recovers, because every new message shifts the context again.
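A toy illustration of the effect described above. It treats a prompt as an array of messages and counts how many leading messages two consecutive prompts share, which is roughly the part a prefix cache could reuse. The function name and the sample data are made up for this example and are not part of CacheGuard.

```javascript
// Illustrative only: counts how many leading messages two prompts share.
// A prefix cache can reuse at most this shared leading part.
function sharedPrefixLength(previousPrompt, nextPrompt) {
  const limit = Math.min(previousPrompt.length, nextPrompt.length);
  let i = 0;
  while (i < limit && previousPrompt[i] === nextPrompt[i]) i++;
  return i;
}

const systemPrompt = "system: you are the narrator";
const history = ["m1", "m2", "m3", "m4"];

// Context not yet full: the next prompt only appends a message,
// so the whole previous prompt is still a valid prefix.
const before = [systemPrompt, ...history];
const appended = [systemPrompt, ...history, "m5"];

// Context full: the oldest message is dropped to make room for the new one,
// so everything after the system prompt shifts by one position.
const shifted = [systemPrompt, ...history.slice(1), "m5"];

const reusedWhenAppending = sharedPrefixLength(before, appended); // 5
const reusedWhenShifting = sharedPrefixLength(before, shifted);   // 1

console.log({ reusedWhenAppending, reusedWhenShifting });
```

Once the window is full, the same shift happens on every turn, which is why the slowdown does not go away on its own.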

The Solution

CacheGuard automatically:

Truncates old messages at a configurable threshold while preserving recent context
Summarizes excluded messages into compact notes that maintain story continuity
Retrieves semantically relevant memories from your conversation history using vector search
Auto-calibrates to optimally fill your context window without overflowing
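The truncation step can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not CacheGuard's actual code: the option names (maxTokens, thresholdRatio, targetRatio, keepRecent), the word-based token counter, and the idea of cutting down to a lower target in one batch are all hypothetical choices made for the example.

```javascript
// Hypothetical sketch; option names are illustrative, not CacheGuard settings.
// Once usage passes the threshold, drop the oldest messages in one batch until
// usage falls to the target, but never touch the last `keepRecent` messages.
function truncateHistory(messages, options) {
  const { maxTokens, thresholdRatio, targetRatio, keepRecent, countTokens } = options;
  const total = messages.reduce((sum, m) => sum + countTokens(m), 0);

  // Below the threshold nothing is removed, so the prompt prefix stays stable.
  if (total <= maxTokens * thresholdRatio) {
    return { kept: messages.slice(), excluded: [] };
  }

  // Messages at or after this index are protected as recent context.
  const protectedStart = Math.max(0, messages.length - keepRecent);
  let remaining = total;
  let cut = 0;
  while (cut < protectedStart && remaining > maxTokens * targetRatio) {
    remaining -= countTokens(messages[cut]);
    cut++;
  }
  return { kept: messages.slice(cut), excluded: messages.slice(0, cut) };
}

// Crude stand-in tokenizer (assumption): one token per whitespace-separated word.
const countWords = (text) => text.split(/\s+/).filter(Boolean).length;

// Ten messages of ten words each: 100 "tokens" in total.
const chat = Array.from({ length: 10 }, () => "a b c d e f g h i j");

const result = truncateHistory(chat, {
  maxTokens: 100,
  thresholdRatio: 0.8,
  targetRatio: 0.5,
  keepRecent: 3,
  countTokens: countWords,
});

console.log(result.kept.length, result.excluded.length);
```

In this sketch the excluded messages are returned rather than discarded, so a later step could summarize them or index them for retrieval. Cutting in one batch down to a lower target means the prompt prefix changes once and then stays stable for several turns, instead of shifting on every message.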

Key Features

🎯 Smart Truncation - Automatically removes old messages while keeping a configurable number of recent ones
📝 Auto-Summarization - Generates concise summaries of excluded messages using your preferred LLM endpoint
🧠 Vector Memory - Qdrant-powered semantic search retrieves relevant past events when contextually appropriate
📊 Visual Dashboard - Real-time context utilization gauge and breakdown by category
⚙️ Auto-Calibration - Self-tuning algorithm learns your optimal context size over a few generations
🔌 LoreVault Compatible - Automatically tracks LoreVault memory tokens in the context breakdown
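
To give a feel for what semantic retrieval means in the Vector Memory feature, here is a small in-memory stand-in. In the extension the search is delegated to Qdrant; the sketch below only mimics the idea with cosine similarity over hand-written vectors. The function names, the memory ids, and the use of cosine similarity specifically are assumptions made for the example.

```javascript
// Illustrative stand-in for a vector search; a real setup would query Qdrant.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as unrelated to everything instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every stored memory against the query and returns the k best matches.
function topKMemories(queryVector, memories, k) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(queryVector, m.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical memories with made-up 3-dimensional embeddings.
const memories = [
  { id: "tavern-fight", vector: [1, 0, 0] },
  { id: "lost-ring", vector: [0, 1, 0] },
  { id: "tavern-rematch", vector: [0.9, 0.1, 0] },
];

const hits = topKMemories([1, 0, 0], memories, 2);
console.log(hits.map((h) => h.id)); // ["tavern-fight", "tavern-rematch"]
```

Real embeddings have hundreds of dimensions and come from an embedding model, but the ranking idea is the same: past events whose vectors point in a similar direction to the current query are the ones brought back into context.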

Quick Start

By default, everything is enabled and works out of the box!
Optionally, configure Auto-Summarize with an OpenAI-compatible endpoint so that truncated messages are summarized instead of being dropped completely.
For vector memory, configure the Qdrant connection in the Qdrant Memory tab.
LoreVault only needs to be enabled in its own extension settings.

Credits & Acknowledgments

This extension builds upon excellent prior work:

st-qdrant-memory by HO-git - Vector memory architecture and Qdrant integration patterns
SillyTavern-MessageSummarize by Qvink - Summarization system design and message processing logic

Their open-source contributions made this extension possible. 🙏

License

MIT License - See LICENSE for details.
