Awful Jade (aka aj) is your command-line sidekick for working with Large Language Models (LLMs).
Think of it as an LLM Swiss Army knife with the best intentions.
Ask questions, run interactive sessions, sanitize messy OCR book dumps, synthesize exam questions, all without leaving your terminal.
It's built in Rust for speed, safety, and peace of mind. 🦀
```text
λ aj --help
Awful Jade - a CLI for local LLM tinkering with memories, templates, and vibes.

Usage: aj <COMMAND>

Commands:
  ask          Ask a single question and print the assistant's response
  interactive  Start an interactive REPL-style conversation
  init         Initialize configuration and default templates in the platform config directory
  reset        Reset the database to a pristine state
  help         Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version
```
- Ask the AI: Run `aj ask "question"` and get answers powered by your configured model.
- Interactive Mode: A REPL-style conversation with memory & vector search (your AI "remembers" past context).
- RAG Support: Load documents for context-aware responses with the `--rag` flag. Automatic chunking, caching, and retrieval.
- Pretty Printing: Beautiful markdown rendering with syntax highlighting for code blocks (`--pretty` flag).
- Progress Indicators: Real-time feedback with spinners for API calls, memory search, and model loading.
- Vector Store: Uses HNSW + sentence embeddings to remember what you've said before (see the sketch after this list). Basically, your AI gets a brain. 🧠
- Brains with Limits: Keeps only as many tokens as you allow. When full, it forgets the oldest stuff. (Like you after 3 AM pizza.)
- Config & Templates: YAML-driven configs and prompt templates. Customize everything, break nothing.
- Auto-downloads embeddings model: Uses Candle (pure Rust ML framework) to automatically download the `all-MiniLM-L6-v2` BERT model from HuggingFace Hub when needed.
- Pure Rust: No Python dependencies! Everything runs in pure Rust using Candle for ML inference.
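To make "semantic search over past messages" concrete, here's an illustrative sketch of nearest-neighbor recall over embeddings. aj uses an HNSW index for this; the brute-force cosine scan below is a minimal stand-in to show the idea, not aj's actual code.

```rust
// Illustrative sketch: find the stored embeddings most similar to a query.
// aj uses an HNSW index; a brute-force scan is shown here for clarity.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb).max(f32::EPSILON)
}

/// Return the indices of the `k` stored embeddings closest to `query`.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    // Sort by similarity, highest first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let memories = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.0, 1.0, 0.0],
        vec![0.9, 0.1, 0.0],
    ];
    let query = vec![1.0, 0.0, 0.0];
    println!("{:?}", top_k(&query, &memories, 2)); // [0, 2]
}
```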
From crates.io:

```sh
cargo install awful_aj
```

This gives you the `aj` binary.
Requirements:

- Rust (use rustup if you don't have it).

The embeddings model (`all-MiniLM-L6-v2`) will be downloaded automatically from HuggingFace Hub to your system's cache directory when first needed. Models are cached using the `hf-hub` crate, typically at:

- macOS: `~/.cache/huggingface/hub/`
- Linux: `~/.cache/huggingface/hub/`
- Windows: `C:\Users\YOU\AppData\Local\huggingface\hub\`

No special setup required! Just install with `cargo install awful_aj` and you're ready to go. The embeddings model is fetched from HuggingFace Hub the first time you use a feature that requires it.
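If you're curious what that download looks like under the hood, here's a minimal sketch using the `hf-hub` crate directly (assuming its default sync API and the upstream `sentence-transformers/all-MiniLM-L6-v2` repo); this is an illustration, not aj's actual code.

```rust
// Minimal sketch, assuming the `hf-hub` crate with default (sync) features.
// Not aj's actual code; it just shows how hf-hub downloads and caches files.
use hf_hub::api::sync::Api;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api = Api::new()?;
    let repo = api.model("sentence-transformers/all-MiniLM-L6-v2".to_string());

    // The first call downloads into the cache directory listed above;
    // later calls just return the cached path.
    let weights = repo.get("model.safetensors")?;
    let tokenizer = repo.get("tokenizer.json")?;

    println!("weights:   {}", weights.display());
    println!("tokenizer: {}", tokenizer.display());
    Ok(())
}
```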
Create default configs, templates, and database:

```sh
aj init
```

This will generate:

- `config.yaml` with sensible defaults
- `templates/default.yaml` and `templates/simple_question.yaml`
- A SQLite database (`aj.db`) for sessions in your config directory

Options:

- `--overwrite`: Force overwrite existing config, templates, and database files
Example:

```sh
aj init --overwrite   # Reinitialize everything from scratch
```

Ask a single question and print the assistant's response:

```sh
aj ask "Is Bibi really from Philly?"
```
You'll get a colorful, model-dependent answer in yellow (or dark gray if the model uses `<think>` tags for reasoning).
Options:

- `-t, --template <NAME>`: Use a specific template (e.g., `simple_question`)
- `-s, --session <NAME>`: Save to a named session for context retention
- `-o, --one-shot`: Ignore any session configured in `config.yaml` (force a standalone prompt)
- `-r, --rag <FILES>`: Comma-separated list of files to use as RAG context
- `-k, --rag-top-k <N>`: Number of RAG chunks to retrieve (default: 3)
- `-p, --pretty`: Enable markdown rendering with syntax highlighting
Examples:

```sh
aj ask "What is HNSW?"
aj ask -t simple_question "Explain Rust lifetimes"
aj ask -s my-session "Remember this: I like pizza"
aj ask -o "What's the weather?"    # Ignores session from config
aj ask -r docs.txt,notes.md -k 5 "Summarize the key points"
aj ask -p "Explain this code"      # Pretty markdown output
aj ask -r docs/ -p -s project "What does this project do?"
```

Talk with the AI in an interactive REPL:

```sh
aj interactive
```
Supports memory via the vector store, so it won't immediately forget your name. (Unlike your barista.)
Colors:

- Your input appears in blue
- Assistant responses appear in yellow
- Model reasoning (in `<think>` tags) appears in dark gray
Options:

- `-t, --template <NAME>`: Use a specific template
- `-s, --session <NAME>`: Use a named session
- `-r, --rag <FILES>`: Comma-separated list of files for RAG context (loaded once for the entire session)
- `-k, --rag-top-k <N>`: Number of RAG chunks to retrieve (default: 3)
- `-p, --pretty`: Enable markdown rendering with syntax highlighting for all responses
Examples:

```sh
aj interactive
aj interactive -s my-session
aj interactive -t reading_buddy -s book-club
aj interactive -r docs/ -p -s project    # Interactive with RAG and pretty output
```

Start fresh by resetting the database to a pristine state:

```sh
aj reset
```
This drops all sessions and messages, then recreates the schema. Useful when you want a clean slate.

Aliases: `aj r`
Edit your config at:

```text
~/.config/aj/config.yaml                                      # Linux
~/Library/Application Support/com.awful-sec.aj/config.yaml    # macOS
```
Example:

```yaml
api_base: "http://localhost:1234/v1"
api_key: "CHANGEME"
model: "jade_qwen3_4b_mlx"
context_max_tokens: 8192
assistant_minimum_context_tokens: 2048
stop_words:
  - "<|im_end|>\\n<|im_start|>"
  - "<|im_start|>\n"
session_db_url: "/Users/you/Library/Application Support/com.awful-sec.aj/aj.db"
session_name: "default"   # Set to null for no session persistence
should_stream: true       # Enable streaming responses
```
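If you're wondering how a config like this maps onto Rust, here's a hypothetical sketch using `serde` and `serde_yaml`. The field names mirror the example above, but aj's actual config type may differ.

```rust
// Hypothetical sketch: field names mirror the example config above,
// but aj's real config struct may differ.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Config {
    api_base: String,
    api_key: String,
    model: String,
    context_max_tokens: usize,
    assistant_minimum_context_tokens: usize,
    stop_words: Vec<String>,
    session_db_url: String,
    session_name: Option<String>, // YAML null => no session persistence
    should_stream: bool,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string("config.yaml")?;
    let cfg: Config = serde_yaml::from_str(&text)?;
    println!("{cfg:#?}");
    Ok(())
}
```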
Load documents as context for your queries:

```sh
aj ask -r document.txt "What are the main points?"
aj ask -r docs/,notes.md -k 5 "Summarize these files"
```

How it works:
- Documents are automatically chunked (512 tokens per chunk, 128-token overlap); see the sketch after this list
- Chunks are embedded and cached in `~/.config/aj/rag_cache/` (or the platform equivalent)
- The top-k most relevant chunks are retrieved and injected into context
- The cache is reused for faster subsequent queries on the same files
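To make the 512/128 numbers concrete, here's an illustrative sliding-window chunker over token IDs. It's a minimal sketch of the scheme described above, not aj's actual implementation.

```rust
// Illustrative sketch of sliding-window chunking (512-token windows,
// 128-token overlap). Not aj's actual chunker.
fn chunk(tokens: &[u32], window: usize, overlap: usize) -> Vec<Vec<u32>> {
    assert!(overlap < window, "overlap must be smaller than the window");
    let stride = window - overlap; // 512 - 128 = 384 tokens per step
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + window).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        if end == tokens.len() {
            break;
        }
        start += stride;
    }
    chunks
}

fn main() {
    let tokens: Vec<u32> = (0..1000).collect();
    let chunks = chunk(&tokens, 512, 128);
    // 1000 tokens -> chunks starting at offsets 0, 384, and 768
    assert_eq!(chunks.len(), 3);
    println!("{} chunks", chunks.len());
}
```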
Cache location:

- macOS: `~/Library/Application Support/com.awful-sec.aj/rag_cache/`
- Linux: `~/.config/aj/rag_cache/`
- Windows: `%APPDATA%\com.awful-sec\aj\rag_cache\`
Templates are YAML files in your config directory. Here's a baby template:

```yaml
system_prompt: "You are Awful Jade, a helpful AI assistant programmed by Awful Security."
messages: []
response_format: null
pre_user_message_content: null
post_user_message_content: null
```

Add more, then swap them in with `-t <name>` or `--template <name>`.
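You can also generate new templates programmatically. The sketch below writes one with `serde_yaml`; the field names mirror the YAML above, but the struct is hypothetical and aj's real types may differ.

```rust
// Hypothetical sketch: writing a new template file with serde_yaml.
// Field names mirror the YAML above; aj's actual types may differ.
use serde::Serialize;

#[derive(Serialize)]
struct Template {
    system_prompt: String,
    messages: Vec<String>,
    response_format: Option<String>,
    pre_user_message_content: Option<String>,
    post_user_message_content: Option<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let t = Template {
        system_prompt: "You are a patient Rust tutor.".into(),
        messages: vec![],
        response_format: None,            // serializes as YAML null
        pre_user_message_content: None,
        post_user_message_content: None,
    };
    std::fs::write("templates/rust_tutor.yaml", serde_yaml::to_string(&t)?)?;
    Ok(())
}
```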
- Brain: Token-budgeted working memory with FIFO eviction. Keeps memories in a deque and trims when it gets too wordy (see the sketch after this list).
- VectorStore: Embeds your inputs using `all-MiniLM-L6-v2` via Candle (pure Rust ML) and saves them to an HNSW index for semantic search.
- RAG System: Intelligent document chunking (512 tokens, 128 overlap), embedding caching, and k-nearest-neighbor retrieval.
- Pretty Printing: Markdown rendering with syntax highlighting for 100+ languages using Syntect.
- Progress UI: Real-time spinners and feedback using Indicatif (API calls, memory search, model loading).
- Candle: Pure Rust ML framework from HuggingFace. Automatically downloads and caches models from HuggingFace Hub.
- Config: YAML-based, sane defaults, easy to tweak.
- Templates: Prompt engineering without copy-pasting into your terminal like a caveman.
- No Python: Everything runs in pure Rust with no external ML runtime dependencies.
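To illustrate the Brain's eviction policy, here's a minimal sketch of a token-budgeted FIFO deque. It mirrors the description above (token counts are supplied by the caller here); it is not aj's actual `Brain` implementation.

```rust
// Minimal sketch of token-budgeted FIFO memory; not aj's actual Brain.
use std::collections::VecDeque;

struct Brain {
    max_tokens: usize,
    used_tokens: usize,
    memories: VecDeque<(String, usize)>, // (text, token_count)
}

impl Brain {
    fn new(max_tokens: usize) -> Self {
        Self { max_tokens, used_tokens: 0, memories: VecDeque::new() }
    }

    fn remember(&mut self, text: String, token_count: usize) {
        // Evict the oldest memories until the new one fits the budget.
        while self.used_tokens + token_count > self.max_tokens {
            match self.memories.pop_front() {
                Some((_, n)) => self.used_tokens -= n,
                None => break, // a single memory exceeds the whole budget
            }
        }
        self.used_tokens += token_count;
        self.memories.push_back((text, token_count));
    }
}

fn main() {
    let mut brain = Brain::new(8);
    brain.remember("first".into(), 4);
    brain.remember("second".into(), 4);
    brain.remember("third".into(), 4); // evicts "first"
    assert_eq!(brain.memories.len(), 2);
}
```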
Clone, hack, repeat:

```sh
git clone https://github.com/graves/awful_aj.git
cd awful_aj
cargo build
```

Run tests:

```sh
cargo test
```
PRs welcome! Bugs, docs, new templates, vector hacks: bring it on. But remember, with great power comes great YAML.
CC-BY-SA-4.0 (Creative Commons Attribution-ShareAlike 4.0 International)
Share and adapt freely, but give credit and share alike. Don't blame us when your AI remembers your browser history.
💡 Awful Jade: bad name, good brain.

