feat: Add Sound Event Detection (SED) module using YAMNet backbone by Alokkohli200 · Pull Request #14 · PlanetRead/Intelligent-cc-generation

Alokkohli200 · 2026-05-09T16:46:42Z

Overview

This PR adds the Sound Event Detection (SED) module to the Intelligent CC Suggestion Tool. The module identifies contextually significant non-speech audio events so that captions for regional content are more accessible.

Technical Implementation

Model: Uses the YAMNet architecture, loaded via TensorFlow Hub and chosen for its accuracy in environmental sound classification.
Pipeline: Extracts the audio track from the video and converts it to a 16 kHz mono waveform, the input format the model requires.
Filtering Logic: A confidence-based filter (threshold 0.25) ignores ambient noise (Silence, White noise) while capturing impactful events such as footsteps, music, or sirens.
Environment: Optimized and verified on Apple Silicon (M4) with Python 3.12; protobuf is pinned to <5 to ensure compatibility.
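
As a rough illustration of the pipeline and filtering steps above, the sketch below builds an ffmpeg command for 16 kHz mono extraction and filters per-frame class scores. It is not the code from this PR: the use of ffmpeg, the function names, the top-1-per-frame rule, and the 0.48 s frame hop (YAMNet's default patch hop) are assumptions made for the example. The scores are assumed to be a per-frame list of class scores like the one YAMNet returns.

```python
from typing import NamedTuple, Sequence

# Assumption: these two labels stand in for the "ambient noise" classes
# named in the PR description.
IGNORED_LABELS = frozenset({"Silence", "White noise"})


class Event(NamedTuple):
    start_s: float      # start time of the frame, in seconds
    label: str          # class name of the top-scoring class
    confidence: float   # score of that class


def ffmpeg_extract_cmd(video_path: str, wav_path: str) -> list[str]:
    """Build an ffmpeg argument list that writes 16 kHz mono audio.

    Hypothetical helper: the PR does not say which tool does the extraction.
    -vn drops the video stream, -ac 1 selects mono, -ar 16000 resamples.
    """
    return ["ffmpeg", "-y", "-i", video_path,
            "-vn", "-ac", "1", "-ar", "16000", wav_path]


def filter_events(
    frame_scores: Sequence[Sequence[float]],
    class_names: Sequence[str],
    threshold: float = 0.25,
    ignored: frozenset = IGNORED_LABELS,
    hop_s: float = 0.48,
) -> list[Event]:
    """Keep the top class of each frame if it is confident and not ambient."""
    events = []
    for i, scores in enumerate(frame_scores):
        best = max(range(len(scores)), key=scores.__getitem__)
        label, conf = class_names[best], scores[best]
        # Assumption: a score equal to the threshold is kept.
        if conf < threshold or label in ignored:
            continue
        events.append(Event(round(i * hop_s, 2), label, conf))
    return events
```

With three toy classes, a frame dominated by "Silence" and a frame whose best score falls below 0.25 are both dropped, while a confident "Siren" frame is kept with its timestamp.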

Demo & Verification

Demo Video: https://www.youtube.com/watch?v=JVOTMxdUcbs
Sample Output: The module successfully generates a timestamped table of events with confidence scores.

Future Improvements

Implement temporal smoothing to merge consecutive short-window detections into single, continuous captions.
Integrate with the SRT generator module for automated caption insertion.
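
One possible shape for these two improvements is sketched below. It is only an illustration under stated assumptions, not a planned design: detections are assumed to arrive as (start in milliseconds, label) pairs, each covering a 960 ms window (YAMNet's patch length), and same-label detections are merged when the gap between them is at most 500 ms. The function names and the bracketed caption style are hypothetical, and the real SRT generator module may expect a different interface.

```python
def merge_detections(detections, window_ms=960, max_gap_ms=500):
    """Merge same-label detections that overlap or nearly touch.

    detections: iterable of (start_ms, label) pairs.
    Returns a list of (start_ms, end_ms, label) tuples in order of first start.
    """
    segments = []        # mutable [start_ms, end_ms, label] entries
    last_by_label = {}   # label -> most recent segment for that label
    for start, label in sorted(detections):
        end = start + window_ms
        seg = last_by_label.get(label)
        if seg is not None and start - seg[1] <= max_gap_ms:
            # Close enough to the previous segment: extend it.
            seg[1] = max(seg[1], end)
        else:
            seg = [start, end, label]
            segments.append(seg)
            last_by_label[label] = seg
    return [tuple(s) for s in segments]


def srt_timestamp(ms):
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments):
    """Render merged segments as numbered SRT cues with bracketed labels."""
    cues = []
    for index, (start, end, label) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n[{label}]"
        )
    return "\n\n".join(cues) + "\n"
```

Working in integer milliseconds avoids floating-point drift when window ends are compared, and it maps directly onto the SRT timestamp format.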

Alokkohli200 added 2 commits May 9, 2026 21:54

feat: Add Sound Event Detection module using YAMNet

ee1514f

docs: update README details

5e57dde


Alokkohli200 wants to merge 2 commits into PlanetRead:main from Alokkohli200:feature/sed-module-alok
