feat: self-contained PoC notebook for intelligent CC suggestion #11

Open
bhuvan-somisetty wants to merge 1 commit into PlanetRead:main from bhuvan-somisetty:feat/poc-notebook-demo

bhuvan-somisetty commented May 8, 2026

Adds a single Jupyter notebook (poc_demo.ipynb) that demonstrates the full CC suggestion pipeline end-to-end, covering all three goals from issue #2.

The notebook needs only numpy to run; TensorFlow, MediaPipe, and ffmpeg are not required. The ML inference calls are replaced with realistic sample data so reviewers can execute every cell without model downloads or a GPU. The decision logic, label mapping, SRT/SLS formatting, and evaluation are real working code throughout.

The pipeline runs in four stages, each explained in a markdown section before the code:

  1. Sound event detection - simulates YAMNet patch scoring, filters speech labels, drops events below the 0.35 confidence threshold, and merges adjacent same-label detections.
  2. Visual reaction analysis - simulates optical flow motion scores and MediaPipe face-shift scores for the reaction window (300–1500 ms after each event).
  3. CC decision engine - category-aware weighted fusion of the audio and visual scores, with a +0.12 boost for high-impact events. Audio/visual weights are 85/15 for high-impact sounds (trust audio) and 30/70 for ambient sounds (require visual confirmation).
  4. Output and evaluation - generates SRT and SLS files and measures precision/recall/F1 against a ground-truth JSON using IoU-based matching.
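
For reviewers who want the stage 3 fusion rule in one place, here is a minimal sketch. The weights and the +0.12 boost are the values listed above. The decision threshold of 0.50, the membership of `HIGH_IMPACT_LABELS`, the clipping to 1.0, and the function names are assumptions made for illustration and may not match the notebook.

```python
# Values stated in this PR description: (audio weight, visual weight).
HIGH_IMPACT_WEIGHTS = (0.85, 0.15)
AMBIENT_WEIGHTS = (0.30, 0.70)
HIGH_IMPACT_BOOST = 0.12

# Assumptions for illustration only; the notebook may use different values.
DECISION_THRESHOLD = 0.50
HIGH_IMPACT_LABELS = {"Fireworks", "Gunshot", "Explosion"}


def fuse(label: str, audio_conf: float, visual_score: float) -> float:
    """Combine audio confidence and visual reaction score into one score."""
    high_impact = label in HIGH_IMPACT_LABELS
    w_audio, w_visual = HIGH_IMPACT_WEIGHTS if high_impact else AMBIENT_WEIGHTS
    score = w_audio * audio_conf + w_visual * visual_score
    if high_impact:
        score += HIGH_IMPACT_BOOST
    # Clipping to 1.0 is an assumption so the score stays a valid confidence.
    return min(score, 1.0)


def should_caption(label: str, audio_conf: float, visual_score: float) -> bool:
    """Suggest a caption only when the fused score reaches the threshold."""
    return fuse(label, audio_conf, visual_score) >= DECISION_THRESHOLD
```

With these weights, a high-impact sound can pass on audio evidence alone, while an ambient sound with the same audio confidence is suppressed unless the visual score is also high.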
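
The IoU-based matching in stage 4 could look roughly like the sketch below. It assumes each event is a dict with `label`, `start`, and `end` (seconds), that a prediction counts as a true positive when it shares a label with an unmatched ground-truth event and their temporal IoU is at least 0.5, and that matching is greedy. The dict layout, the 0.5 threshold, and the greedy strategy are assumptions and are not taken from the notebook.

```python
def temporal_iou(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Intersection over union of two (start, end) time intervals."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def evaluate(preds: list[dict], truths: list[dict], iou_threshold: float = 0.5):
    """Return (precision, recall, f1) using greedy one-to-one IoU matching."""
    matched: set[int] = set()  # indices of ground-truth events already used
    true_positives = 0
    for pred in preds:
        best_index, best_iou = None, iou_threshold
        for i, truth in enumerate(truths):
            if i in matched or truth["label"] != pred["label"]:
                continue
            iou = temporal_iou((pred["start"], pred["end"]),
                               (truth["start"], truth["end"]))
            if iou >= best_iou:
                best_index, best_iou = i, iou
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

One-to-one matching keeps a single ground-truth event from being counted for several overlapping predictions, which would otherwise inflate precision.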

The sample data is chosen to demonstrate all filtering cases: speech suppressed, low-confidence events dropped, adjacent detections merged, ambient music and a dog bark suppressed by the decision threshold, and India-specific labels (Fireworks → [firecrackers], Tabla → [tabla]) preserved correctly.
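
The stage 1 filtering cases can be sketched as follows. The 0.35 threshold and the two label mappings come from this description. The speech label set, the 0.5 s merge gap, the tuple layout, the fallback caption format, and the function names are assumptions made for illustration.

```python
CONFIDENCE_THRESHOLD = 0.35  # stated in this PR description
LABEL_MAP = {"Fireworks": "[firecrackers]", "Tabla": "[tabla]"}  # stated above

# Assumptions for illustration only; the notebook may differ.
SPEECH_LABELS = {"Speech", "Conversation"}
MAX_MERGE_GAP_S = 0.5


def detect_events(patches):
    """Filter and merge (start_s, end_s, label, confidence) patch detections."""
    kept = [
        p for p in patches
        if p[2] not in SPEECH_LABELS and p[3] >= CONFIDENCE_THRESHOLD
    ]
    events = []
    for start, end, label, conf in sorted(kept):
        last = events[-1] if events else None
        if last and last["label"] == label and start - last["end"] <= MAX_MERGE_GAP_S:
            # Adjacent detection of the same label: extend the previous event.
            last["end"] = max(last["end"], end)
            last["confidence"] = max(last["confidence"], conf)
        else:
            events.append(
                {"start": start, "end": end, "label": label, "confidence": conf}
            )
    return events


def caption_for(label: str) -> str:
    """Map a model label to caption text, falling back to a bracketed label."""
    return LABEL_MAP.get(label, f"[{label.lower()}]")
```

Keeping the maximum confidence when merging is one possible choice; averaging the merged patches would be an equally reasonable alternative.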

Fixes #2

Signed-off-by: bhuvan-somisetty <somisettybhuvan5@gmail.com>
Development

Successfully merging this pull request may close these issues.

[DMP 2026]: Create Intelligent Closed Caption (CC) Suggestion Tool
