[DMP 2026] Implement Sound Event Detection from Video #6

Open
Govindggupta wants to merge 2 commits into PlanetRead:main from Govindggupta:gg/cc-generation

@Govindggupta

Overview

This PR adds a focused MVP for Module 1: Sound Event Detection.

The demo takes a video or audio file as input, extracts/normalizes the audio, runs sound event detection using YAMNet, and exports detected non-speech events with timestamps and confidence scores in JSON and CSV formats.
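As an illustration of the export step described above, here is a minimal sketch that writes a list of detected events to JSON and CSV using only the standard library. It is not the code in this PR: the field names (`label`, `start`, `end`, `confidence`), the helper names, and the sample values are all hypothetical.

```python
import csv
import json
from pathlib import Path

# Hypothetical event records. The field names and values are assumptions
# for illustration, not the schema or output produced by this PR.
events = [
    {"label": "Gunshot, gunfire", "start": 1.92, "end": 2.88, "confidence": 0.81},
    {"label": "Machine gun", "start": 4.80, "end": 6.24, "confidence": 0.67},
]

FIELDNAMES = ["label", "start", "end", "confidence"]


def write_json(events, path):
    """Write events as a JSON array, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(events, f, indent=2)


def write_csv(events, path):
    """Write events as CSV with one header row and one row per event."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(events)
```

Keeping both writers on the same list of dictionaries means the JSON and CSV files cannot drift apart in content.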

This PR intentionally focuses only on Module 1. Visual reaction detection, CC/no-CC decision logic, and SRT/SLS generation are left for later modules.

What This PR Includes

  • CLI script to run sound event detection:
    • detect_sound_events.py
  • Audio extraction and normalization:
    • converts video/audio input to 16 kHz mono WAV
    • uses imageio-ffmpeg, so no system FFmpeg installation is required
  • YAMNet-based sound event detection:
    • detects AudioSet sound classes
    • filters detections by confidence
    • supports optional noisy-label blocking
    • merges nearby same-label events
  • JSON and CSV output generation
  • Simple CC-style label mapping:
    • example: Gunshot, gunfire → [gunshot]
    • example: Machine gun → [machine gun]
  • Sample input video and corresponding output files
  • README with setup, run command, MVP flow, output format, and limitations
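
The post-processing bullets above (confidence filtering, label blocking, merging nearby same-label events, and CC-style label mapping) could be sketched roughly as follows. This is not the code in detect_sound_events.py: the `Event` fields, the 0.5-second merge gap, and the mapping rule (first comma-separated name, lower-cased, in brackets) are assumptions chosen only to be consistent with the two examples listed.

```python
from dataclasses import dataclass


@dataclass
class Event:
    label: str         # AudioSet class name, e.g. "Machine gun"
    start: float       # seconds
    end: float         # seconds
    confidence: float  # score in [0, 1]


def filter_events(events, min_confidence=0.5, blocked_labels=()):
    """Drop low-confidence events and events whose label is blocked."""
    blocked = {b.lower() for b in blocked_labels}
    return [
        e for e in events
        if e.confidence >= min_confidence and e.label.lower() not in blocked
    ]


def merge_nearby(events, max_gap=0.5):
    """Merge same-label events separated by at most max_gap seconds."""
    merged = []
    # Sorting by (label, start) puts same-label events next to each other.
    for e in sorted(events, key=lambda e: (e.label, e.start)):
        prev = merged[-1] if merged else None
        if prev and prev.label == e.label and e.start - prev.end <= max_gap:
            prev.end = max(prev.end, e.end)
            prev.confidence = max(prev.confidence, e.confidence)
        else:
            # Copy so the caller's events are not mutated.
            merged.append(Event(e.label, e.start, e.end, e.confidence))
    return sorted(merged, key=lambda e: e.start)


def cc_label(audioset_label):
    """Assumed rule: first comma-separated name, lower-cased, in brackets."""
    first = audioset_label.split(",")[0].strip().lower()
    return f"[{first}]"


# Illustrative input only; these are not real detections.
raw = [
    Event("Machine gun", 1.0, 1.96, 0.7),
    Event("Machine gun", 2.1, 3.0, 0.9),
    Event("Bird", 0.0, 1.0, 0.95),
    Event("Gunshot, gunfire", 5.0, 5.96, 0.4),
]
kept = filter_events(raw, min_confidence=0.5, blocked_labels=["Animal", "Bird"])
result = merge_nearby(kept)
```

Here the merged event keeps the highest confidence of its parts; averaging would be an equally reasonable choice.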

Demo Command

python detect_sound_events.py \
  --input samples/test_video.mp4 \
  --json outputs/test_video_events.json \
  --csv outputs/test_video_events.csv \
  --min-confidence 0.5 \
  --block-label Animal,Bird
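
For readers who want to see how flags like these might be parsed, the following is a sketch using `argparse`. The flag names come from the command above; the defaults, the help text, and the comma-separated handling of `--block-label` are assumptions and may differ from the actual script.

```python
import argparse


def comma_list(value):
    """Split "Animal,Bird" into ["Animal", "Bird"], ignoring empty parts."""
    return [part.strip() for part in value.split(",") if part.strip()]


def build_parser():
    parser = argparse.ArgumentParser(
        description="Detect non-speech sound events in a video or audio file."
    )
    parser.add_argument("--input", required=True, help="Path to a video or audio file.")
    parser.add_argument("--json", help="Where to write the JSON output.")
    parser.add_argument("--csv", help="Where to write the CSV output.")
    # The default of 0.5 is an assumption, taken from the demo command.
    parser.add_argument("--min-confidence", type=float, default=0.5)
    # Assumed to take a comma-separated list of labels to suppress.
    parser.add_argument("--block-label", type=comma_list, default=[])
    return parser


# Parse the demo arguments from an explicit list instead of sys.argv.
args = build_parser().parse_args([
    "--input", "samples/test_video.mp4",
    "--json", "outputs/test_video_events.json",
    "--csv", "outputs/test_video_events.csv",
    "--min-confidence", "0.5",
    "--block-label", "Animal,Bird",
])
```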

Govindggupta changed the title from "Implementation : Sound Even Detection from video" to "[DMP 2026] Implement Sound Event Detection from Video" on May 6, 2026
@Govindggupta
Author

Related to #2

@Govindggupta
Author

Govindggupta commented May 6, 2026

  1. A demonstration video screen-recording your code running end-to-end and showing the output — please attach it directly to the PR or share a link (Google Drive / YouTube unlisted)

As instructed, this is a video of the code running locally on my machine.
The input video is in the samples folder. I used a clip from a game because of its high audio complexity.

2026-05-07.02-06-54.mp4

@abinash-sketch

Let me know when we can connect.

@Govindggupta
Author

Govindggupta commented May 7, 2026

@abinash-sketch we can connect today. Can I get your LinkedIn or X?
How can I connect with you?

@Govindggupta
Author

@abinash-sketch any update on this?
