TranscriptAI

Automatic topic segmentation and analysis for VTT transcripts, powered by local LLMs via Ollama.

What it does

Give it a .vtt transcript file and it will:

Detect topic boundaries using embedding-based DeepTiling (the default `--boundary-method`), with no LLM calls needed for segmentation
Analyze each topic with a single LLM call — titles, summaries, keywords, key quotes, and subtopics
Output structured JSON ready for downstream consumption
Visualize topics on a timeline

A 5-minute transcript processes in ~30 seconds (0.10x real-time factor) on Apple Silicon with gemma3:4b.

How it works

Boundary detection uses a DeepTiling algorithm:

Sliding window embeddings via nomic-embed-text
Cosine similarity curves between left/right context windows
Depth score computation to find topic transitions
Automatic merging of short segments using cached embeddings (zero additional API calls)
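
The steps above can be sketched roughly as follows. This is a minimal TextTiling-style illustration, not the code in main.py: the function names, the window size, and the `mean + sensitivity * std` threshold rule are assumptions, and toy vectors stand in for real nomic-embed-text embeddings.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_curve(embeddings, window=2):
    """Similarity between the mean left and right context windows at each gap."""
    sims = []
    for gap in range(1, len(embeddings)):
        left = embeddings[max(0, gap - window):gap].mean(axis=0)
        right = embeddings[gap:gap + window].mean(axis=0)
        sims.append(cosine(left, right))
    return np.array(sims)

def depth_scores(sims):
    """Depth of each point below the nearest peak on its left and right."""
    depths = np.zeros_like(sims)
    for i in range(len(sims)):
        left = i
        while left > 0 and sims[left - 1] >= sims[left]:
            left -= 1  # climb to the left peak
        right = i
        while right < len(sims) - 1 and sims[right + 1] >= sims[right]:
            right += 1  # climb to the right peak
        depths[i] = (sims[left] - sims[i]) + (sims[right] - sims[i])
    return depths

def find_boundaries(depths, sensitivity=0.1):
    """Assumed rule: a higher sensitivity raises the threshold, so fewer boundaries."""
    threshold = depths.mean() + sensitivity * depths.std()
    # Gap i sits between segment i and segment i + 1.
    return [i + 1 for i, d in enumerate(depths) if d > threshold]

# Toy input: three segments about one topic, then three about another.
toy = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
boundaries = find_boundaries(depth_scores(similarity_curve(toy)))
```

With the toy input, the similarity curve dips to zero at the gap between the third and fourth segments, so that gap gets the largest depth score and is reported as the only boundary.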

Analysis sends all topics to the LLM in a single call, producing:

Specific, descriptive titles using proper nouns
Narrative summaries explaining arguments and tensions
Named entity keywords (not generic category words)
Verbatim key quotes with context explaining why they matter
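
As an illustration of the single-call approach, the sketch below packs every topic into one prompt and sends one request to Ollama's `/api/generate` endpoint. The prompt wording and the function names are hypothetical, and the standard library is used instead of `requests` only to keep the example self-contained; the actual prompt in main.py is not shown here.

```python
import json
import urllib.request

def build_prompt(topic_texts):
    """Number each topic so the model can return one analysis per topic."""
    blocks = [f"### Topic {i}\n{text}" for i, text in enumerate(topic_texts, start=1)]
    return (
        "Analyze each topic below. Return JSON of the form "
        '{"topics": [{"title": ..., "summary": ..., "keywords": [...], '
        '"key_quotes": [{"quote": ..., "context": ...}]}]} '
        "with one entry per topic, in order.\n\n" + "\n\n".join(blocks)
    )

def build_request(topic_texts, model="gemma3:4b", num_ctx=3072):
    """Request body for a single non-streaming Ollama generate call."""
    return {
        "model": model,
        "prompt": build_prompt(topic_texts),
        "stream": False,        # one complete reply instead of chunks
        "format": "json",       # ask Ollama to constrain the reply to JSON
        "options": {"num_ctx": num_ctx},
    }

def analyze(topic_texts, llm_url="http://localhost:11434"):
    """Send every topic in one call and parse the model's JSON reply."""
    body = json.dumps(build_request(topic_texts)).encode("utf-8")
    request = urllib.request.Request(
        f"{llm_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return json.loads(reply["response"])
```

Batching all topics into one request keeps the number of LLM round trips at one, at the cost of needing the whole prompt to fit inside the `--num-ctx` window.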

Requirements

Python 3.x
Ollama running locally
Models: gemma3:4b (analysis) + nomic-embed-text (embeddings)

pip install requests matplotlib pandas numpy
ollama pull gemma3:4b
ollama pull nomic-embed-text

Usage

# Basic analysis
python main.py transcript.vtt

# With visualization
python main.py transcript.vtt --visualize

# Custom output path
python main.py transcript.vtt -o analysis.json

# List available models
python main.py transcript.vtt --list-models

CLI Options

Flag	Default	Description
`--output`, `-o`	`vtt_analysis.json`	Output JSON path
`--visualize`, `-v`	off	Generate timeline visualization
`--visualization-output`	`topic_timeline.png`	Visualization output path
`--llm-url`	`http://localhost:11434`	Ollama API URL
`--model`	`gemma3:4b`	LLM model for analysis
`--num-ctx`	`3072`	LLM context window size
`--max-workers`	`4`	Concurrent LLM requests (fallback mode)
`--boundary-method`	`embedding`	`embedding`, `llm`, or `hybrid`
`--boundary-sensitivity`	`0.1`	Higher = fewer boundaries (0.0–2.0)
`--embedding-model`	`nomic-embed-text`	Model for embeddings

Example Output

{
  "topics": [
    {
      "title": "Washaway Beach: A Cycle of Loss",
      "summary": "A coastal town faces devastating property loss due to rising sea levels...",
      "keywords": ["Washaway Beach", "coastal erosion", "home prices"],
      "key_quotes": [
        {
          "quote": "Year after year, more homes would fall in the water.",
          "context": "Establishes the scale and ongoing nature of the erosion crisis."
        }
      ],
      "subtopics": [...],
      "timespan": "0:00:16 - 0:00:49",
      "start_seconds": 16.66,
      "end_seconds": 49.659
    }
  ],
  "summary": {
    "total_duration": "0:05:09",
    "topic_count": 7
  }
}
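
For downstream consumption, a reader of this file might look like the sketch below. It relies only on the fields shown in the example above; the helper names `load_analysis` and `topic_rows` are hypothetical and are not part of main.py.

```python
import json

def load_analysis(path="vtt_analysis.json"):
    """Load the analysis JSON written by the tool (default output path)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

def topic_rows(analysis):
    """Return (title, timespan, duration in seconds) for each topic."""
    rows = []
    for topic in analysis["topics"]:
        duration = topic["end_seconds"] - topic["start_seconds"]
        rows.append((topic["title"], topic["timespan"], round(duration, 3)))
    return rows

if __name__ == "__main__":
    for title, timespan, duration in topic_rows(load_analysis()):
        print(f"{timespan}  {title}  ({duration:.1f}s)")
```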

Timeline Visualization

Architecture

Component	Role
`VTTParser`	Parses VTT files into timestamped segments
`EmbeddingBoundaryDetector`	DeepTiling boundary detection with embedding cache
`LLMClient`	Ollama API interface with connection pooling and retry
`VTTAnalyzer`	Orchestrates the full pipeline
`TopicSegment`	Data model for analyzed topics
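
To make the first stage of the pipeline concrete, here is a minimal sketch of turning VTT text into timestamped segments. It is not the project's `VTTParser`: the `Segment` dataclass and `parse_vtt` function are hypothetical names, and cue settings, styling blocks, and inline tags are ignored.

```python
import re
from dataclasses import dataclass

# Cue timing line, e.g. "00:00:16.660 --> 00:00:49.659" (the hours part is optional).
CUE_RE = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)

@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str

def _seconds(hours, minutes, seconds, millis):
    """Convert timestamp parts to seconds; a missing hours part counts as 0."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000

def parse_vtt(content):
    """Split on blank lines and keep every block that contains a timing line."""
    segments = []
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = CUE_RE.match(line.strip())
            if match:
                parts = match.groups()
                # Everything after the timing line is the cue payload.
                text = " ".join(l.strip() for l in lines[index + 1:] if l.strip())
                segments.append(Segment(_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break
    return segments
```

Blocks without a timing line, such as the `WEBVTT` header, are skipped, and an optional cue identifier above the timing line is ignored.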

License

MIT
