soy2oon/2026_Spring_DSL_Modeling_CV

Real-time Presentation/Interview Coach via Computer Vision and Speech Language Models (Webcam + Speech)

Structure

  • src/multimodal_coach: core package
  • apps/run_multimodal_coach.py: unified webcam runner
  • assets/: reference video/audio/subtitles/derived pose data
  • experiments/: legacy pilots and gesture FSL project
  • tests/: unit tests

Run

PYTHONPATH=src python apps/run_multimodal_coach.py

API (speech feedback)

PYTHONPATH=src uvicorn multimodal_coach.api.feedback_server:app --reload

About

Real-time interview and presentation coach powered by MediaPipe (478-point face mesh plus pose estimation) and Whisper STT. It tracks eye contact, expression, posture, speech pace, filler words (Korean/English), and prosody, then delivers live nudges during the session and an LLM-generated report afterwards.
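As an illustration of the speech pace and filler word metrics mentioned above, here is a minimal sketch that works on timestamped transcript segments such as an STT system might produce. It is not taken from the repository: the `Segment` type, the `speech_stats` function, and the filler word sets are all hypothetical, and the real package may compute these metrics differently.

```python
import re
from dataclasses import dataclass

# Hypothetical filler lists for illustration only; the project's actual
# Korean/English lists are not shown in this README.
FILLERS_EN = {"um", "uh", "er", "hmm"}
FILLERS_KO = {"음", "어", "저"}
FILLERS = FILLERS_EN | FILLERS_KO


@dataclass
class Segment:
    """One transcribed chunk of speech with start/end times in seconds."""
    start: float
    end: float
    text: str


def tokenize(text: str) -> list[str]:
    # Runs of Unicode letters (covers Hangul and Latin), lowercased,
    # with an optional apostrophe part so "don't" stays one token.
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def speech_stats(segments: list[Segment]) -> dict:
    """Return word count, filler count, and words per minute of speaking time."""
    words = 0
    fillers = 0
    seconds = 0.0
    for seg in segments:
        tokens = tokenize(seg.text)
        words += len(tokens)
        fillers += sum(1 for tok in tokens if tok in FILLERS)
        seconds += max(0.0, seg.end - seg.start)
    # Guard against empty or zero-length input.
    wpm = words * 60.0 / seconds if seconds > 0 else 0.0
    return {"words": words, "fillers": fillers, "wpm": wpm}


if __name__ == "__main__":
    demo = [
        Segment(0.0, 6.0, "Um, I think this is, uh, a good plan"),
        Segment(6.0, 12.0, "음 그리고 결과는 좋았습니다"),
    ]
    print(speech_stats(demo))
```

Exact-match token lookup is a deliberate simplification: words such as "like" in English or "그" in Korean are fillers only in some contexts, so a real detector would need context or a language model to avoid false positives.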

Languages

  • Jupyter Notebook 98.6%
  • Python 1.4%