
clearscript — local-first ASR transcript editor with a compounding terminology library

MIT License Python 3.11+ v0.0.10 CI Simplified Chinese

Local-first · Bring your own model · Compounding library


You just spent 45 minutes fixing the same transcript errors. Again.

Your ASR tool gets 95% of the transcript right. The other 5% is mind-numbing:

  • "Speaker 2" never gets a real name
  • "Dify" came back as "DeFi"
  • "PingCAP" came back as "PinkCup"
  • The mic-check pleasantries you delete every time
  • The same misheard jargon you fixed last week

Then tomorrow you'll do another interview and start over.

clearscript watches you fix it once. Next time, it remembers.

Tip

Drop in a raw transcript → pick any model → click Run → edit inline → download .docx. Your fixes go into a local terminology library. Run #10 barely needs touching.

This started as a personal Claude skill that ran on a few hundred VC reference checks, founder interviews, board meetings, and podcast cleanups. The library learned. The 45 minutes shrank. Now it's MIT-licensed, local-first, and yours to use.

🟥  Local-first

Transcripts and the terminology library live on your disk. No accounts, no telemetry, no cloud dependency. The only network call is the one you authorize — your chosen LLM provider.

🟦  Bring your own model

Five adapters cover 20+ services: Anthropic · OpenAI · DeepSeek · Moonshot · Qwen · Together · Groq · Fireworks · Mistral · OpenRouter · Google Gemini · Ollama · llama.cpp · LM Studio · custom endpoints (incl. Colab tunnels).

🟨  Compounding library

A local SQLite knowledge base of terms, speakers, and edit patterns that grows with every session. Next session pre-loads the relevant subset automatically. Markdown views auto-export — human-readable and git-trackable.
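The lookup side of this idea can be sketched with Python's built-in `sqlite3` and an FTS5 table. This is illustrative only — clearscript's actual schema and queries may differ:

```python
# Illustrative sketch, not clearscript's real schema: a local FTS5 table
# that answers "which canonical term does this ASR mishearing map to?"
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE terms USING fts5(canonical, aliases)")
con.execute(
    "INSERT INTO terms VALUES (?, ?), (?, ?)",
    ("PingCAP", "PinkCup Ping Cap", "Dify", "DeFi Diffy"),
)

# A misheard token matches via full-text search on the alias column.
row = con.execute(
    "SELECT canonical FROM terms WHERE terms MATCH ?", ("aliases:PinkCup",)
).fetchone()
print(row[0])  # PingCAP
```

Because the table is plain SQLite, it stays readable from outside the tool — the same property the "no proprietary data formats" principle below commits to.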

Why it exists

Existing transcript-cleanup tools fall into two camps:

  • Cloud SaaS (Otter, Rev, Sonix): your audio and text get uploaded, processed by a closed model, and stored on someone else's servers. Privacy is a checkbox, not an architecture.
  • Generic LLM chats (paste into ChatGPT): every session starts from zero. The model has no memory of who your speakers are, what your industry's jargon looks like, or which corrections you've made a hundred times before.

clearscript is the third option:

|                                  | Cloud SaaS | Plain LLM chat | clearscript |
| -------------------------------- | ---------- | -------------- | ----------- |
| Data stays local                 | ✗          | ✗              | ✓           |
| Bring your own model             | ✗          | ✓              | ✓           |
| Works offline (with local model) | ✗          | ✗              | ✓           |
| Compounding terminology library  | ✗          | ✗              | ✓           |
| Reproducible / audit trail       | ✗          | ✗              | ✓           |
| Multi-format input/output        | ✓          | ✗              | partial     |

Quick start

Requires Python 3.11+ and uv.

git clone https://github.com/Chen17-sq/clearscript.git
cd clearscript
uv sync
export ANTHROPIC_API_KEY=sk-ant-...   # or DEEPSEEK_API_KEY / OPENAI_API_KEY / GEMINI_API_KEY

Option 1 — Web UI (recommended)

uv run clearscript serve

Opens http://127.0.0.1:7681 in your browser. Bauhaus-styled single-page interface: pick a provider pill, paste or drag in your transcript, click Clean transcript, download as .md / .docx.

Option 2 — CLI

uv run clearscript run examples/01-basic-cleanup/input.txt --provider claude

The cleaned transcript is written next to the input as input.cleaned.md, with a JSON change log alongside.

Prefer a different model?

# OpenAI-compatible (DeepSeek, Moonshot, Qwen, Together, Groq, Fireworks, Mistral, OpenRouter, ...)
export DEEPSEEK_API_KEY=sk-...
uv run clearscript run input.txt --provider deepseek

# 100% local (Ollama / llama.cpp server / LM Studio)
uv run clearscript run input.txt --provider ollama --model qwen2.5:14b

Status

v0.0.6 — pre-alpha. The local web UI ships Bauhaus-styled Editor, Library, and Projects tabs; multi-format ingest (.txt / .md / .docx / .srt / .vtt / .json); a compounding terminology library with Mode A activation and Mode B harvest; every Run auto-saves as a project to ~/Documents/clearscript/projects/; long transcripts (60+ min) are auto-chunked at speaker boundaries and stitched back together. Full v0.1 plan: see ROADMAP.
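The chunking idea is simple: pack speaker turns into LLM-sized chunks, but only ever cut where the speaker changes, so no utterance is split mid-sentence. A minimal sketch of that idea (not clearscript's actual implementation):

```python
# Sketch only — not the shipped chunker. Pack (speaker, text) turns into
# chunks under a size budget, cutting only at speaker boundaries.
def chunk_at_speaker_boundaries(turns, max_chars=4000):
    """turns: list of (speaker, text) tuples -> list of chunks of turns."""
    chunks, current, size = [], [], 0
    for speaker, text in turns:
        if current and size + len(text) > max_chars:
            chunks.append(current)   # close the chunk at a turn boundary
            current, size = [], 0
        current.append((speaker, text))
        size += len(text)
    if current:
        chunks.append(current)
    return chunks

turns = [("A", "x" * 3000), ("B", "y" * 2000), ("A", "z" * 1500)]
print([len(c) for c in chunk_at_speaker_boundaries(turns)])  # [1, 2]
```

After the LLM cleans each chunk independently, the chunks are concatenated back in order ("stitched"), which is safe precisely because every cut fell on a turn boundary.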

Supported input formats today

| Format  | Source examples |
| ------- | --------------- |
| `.txt`  | Generic, with speaker-label heuristics |
| `.md`   | Auto-strips AI-summary blocks (English + Chinese) |
| `.docx` | Feishu Minutes (飞书妙记) / Tencent Meeting (腾讯会议) / Tongyi Tingwu (通义听悟) / generic Word |
| `.srt`  | SubRip subtitles, time-stamped |
| `.vtt`  | WebVTT (honors `<v Speaker>` voice tags) |
| `.json` | OpenAI Whisper / PLAUD / Google STT / Deepgram / generic flat list |
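To make the `.vtt` row concrete: WebVTT attributes speech to speakers with `<v Speaker>` voice spans. A toy extractor for those spans (illustrative only — not clearscript's shipped parser, which handles cues and timestamps too):

```python
# Illustrative sketch of WebVTT <v Speaker> handling, not the real ingest code.
import re

CUE_VOICE = re.compile(r"<v\s+([^>]+)>(.*?)(?:</v>|$)")

def parse_vtt_voices(vtt_text):
    """Yield (speaker, text) pairs from <v Speaker> voice tags."""
    for line in vtt_text.splitlines():
        for speaker, text in CUE_VOICE.findall(line):
            yield speaker.strip(), text.strip()

sample = """WEBVTT

00:00:01.000 --> 00:00:04.000
<v Alice>So the product is called Dify, not DeFi.</v>
"""
print(list(parse_vtt_voices(sample)))
# [('Alice', 'So the product is called Dify, not DeFi.')]
```

Formats without explicit speaker markup (plain `.txt`) fall back to heuristics, e.g. recognizing `Name:` prefixes at line starts.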

Shipped in v0.0.1

  • 5 LLM provider adapters (20+ services)
  • .txt ingest with speaker heuristics
  • SQLite library: terms / aliases / speakers / patterns / sessions / negatives + FTS5
  • Single-pass pipeline (ingest → LLM → md/docx + JSON changelog)
  • CLI: run, providers, lib stats / add-term / lookup
  • Bundled prompt library (system + 7 stage prompts + 7 layer specs)
  • User overrides via ~/.config/clearscript/prompts/
  • Bilingual docs, MIT license, full GitHub templates
  • 27 unit tests, CI on macOS + Linux + Windows × Py 3.11/3.12/3.13

Coming in v0.1

  • 12 ASR input formats (Feishu Miaoji, Typeless, Tongyi Tingwu, Tencent Meeting, Yuanbao, PLAUD, SRT, VTT, JSON, HTML, LRC, etc.)
  • Full pipeline decomposition (pre-scan → context briefing → chunking → self-review → batch-ask → re-scan)
  • L3.5 sentence-level reasoning layer
  • SvelteKit web UI with Bauhaus design system
  • Library Mode A (project-start activation) and Mode C (in-flight learning)
  • PyInstaller-packaged desktop installers (.app · .exe · .AppImage)
  • MkDocs Material doc site auto-deployed to GitHub Pages

Design principles (non-negotiable)

These constrain every feature decision:

  • 🟥  No telemetry. Ever. Not opt-in, not anonymous, not "just for crashes."
  • 🟦  No mandatory network calls beyond the user's chosen LLM provider.
  • 🟨  No proprietary data formats. Markdown, SQLite, JSON — readable from outside clearscript forever.
  • ⬛  No assumed cloud services. If it requires an account somewhere, it's behind a setting that's off by default.

Project layout

clearscript/
├── src/clearscript/         # Python package
│   ├── core/                # Pipeline orchestration
│   ├── ingest/              # ASR-format parsers
│   ├── providers/           # LLM provider adapters
│   ├── library/             # Terminology knowledge base
│   ├── layers/              # Edit layers (L1-L6 + L3.5 + Self-review)
│   ├── export/              # Output formatters
│   ├── storage/             # Project filesystem layout
│   └── prompts/             # LLM prompt templates (markdown, user-overridable)
├── web/                     # SvelteKit web UI (v0.1 onward)
├── tests/                   # pytest suite
├── docs/                    # MkDocs site sources + design system + assets
└── examples/                # Synthetic before/after samples

Read docs/architecture.md for the full pipeline contract.

Use it your way

| Form | For whom | How to install |
| ---- | -------- | -------------- |
| Local app (web UI) | Anyone who wants the full editor: Library tab, project history, inline diff, cost preview | `git clone … && uv sync && uv run clearscript serve` |
| Claude skill | Anyone using Claude Code or the Claude Agent SDK | Download `clearscript.skill` from Releases, then unzip into `~/.claude/skills/` |

Both share the exact same prompt library. Improvements to the standalone app's editing rules ship into the skill on the next release.

→ Read the manifesto for why local-first matters.

Contributing

Issues and PRs welcome. See CONTRIBUTING.md. Easy ways to help:

  • Add an ASR format adapter — write a parser for a tool we don't yet support
  • Suggest ASR error patterns — open an issue with ASR original → correct pairs you've seen
  • Test with your own model — try a provider we haven't smoke-tested and report
  • Contribute a domain pack (post-v0.3) — industry-specific terminology bundles


clearscript · Released under the MIT License · Built for people who care about their transcripts
