severilov/DL-Audio-Course

Deep Learning for Audio Course, Fall 2024

⭐⭐⭐ For more up-to-date materials (2025-2026), go here: https://github.com/severilov/DL-Audio-AIMasters-Course ⭐⭐⭐

Description

Topics covered in the course:

  • Digital Signal Processing
  • Automatic Speech Recognition (ASR)
  • Key-word spotting (KWS)
  • Text-to-Speech (TTS)
  • Voice Conversion
  • Unsupervised learning in Audio
  • Music Generation with NNs

Course materials

Materials

| # | Date | Description | Slides / Notebook | Video |
|---|------|-------------|-------------------|-------|
| 1 | September 12 | Lecture 1: Introduction and Digital Signal Processing | slides | video |
| 2 | September 19 | Seminar 1: Introduction, Spectrograms and Griffin-Lim | notebook | video |
| 3 | September 30 | Lecture 2: Automatic Speech Recognition 1: WER, CTC, LAS, Beam Search | slides | video |
| 4 | October 3 | Seminar 2: CTC, Beam Search | notebook | video |
| 5 | October 10 | Lecture 3: Automatic Speech Recognition 2: RNN-T, Conformer, Whisper, Language models in ASR, BPE | slides | video |
| 6 | October 17 | Lecture 4: Key-word spotting (KWS) | slides | video |
| 7 | October 24 | Seminar 3: CTC, Beam Search | notebook | video |
| 8 | October 31 | Lecture 5: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides | video |
| 9 | November 7 | Seminar 4: Key-word spotting | notebook | video |
| 10 | November 14 | Seminar 5: Text-to-speech: Tacotron2 | notebook | video |
| 11 | November 21 | Lecture 6: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides | video |
| 12 | November 28 | Lecture 7: Self-supervised learning in Audio | slides | video |

Homeworks

| Homework | Date | Deadline | Description | Link |
|----------|------|----------|-------------|------|
| 1 | September 30 | October 14 | 1. Audio classification; 2. Audio preprocessing | Open In Github |
| 2 | September 30 | October 28 | ASR-1: CTC | Open In Github |
| 3 | November 11 | November 25 | ASR-2: RNN-T | Open In Github |
| 4 | November 11 | December 9 | Text-to-speech: FastPitch | Open In Github |

Game rules

  • 4 homeworks, 2 points each = 8 points
  • final test = 2 points
  • maximum points: 8 + 2 = 10 points
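
The arithmetic above can be sketched as a small helper. This is only an illustration: the names `total_score`, `NUM_HOMEWORKS`, `POINTS_PER_HOMEWORK` and `FINAL_TEST_POINTS` are hypothetical, and the possibility of partial credit per homework is an assumption that the rules do not state.

```python
# Constants taken from the rules above; the names are hypothetical.
NUM_HOMEWORKS = 4
POINTS_PER_HOMEWORK = 2
FINAL_TEST_POINTS = 2
MAX_POINTS = NUM_HOMEWORKS * POINTS_PER_HOMEWORK + FINAL_TEST_POINTS


def total_score(homework_scores, test_score):
    """Sum homework and final-test points after checking the stated maxima.

    Assumption: partial credit (fractional points) is allowed.
    """
    if len(homework_scores) != NUM_HOMEWORKS:
        raise ValueError(f"expected {NUM_HOMEWORKS} homework scores")
    for score in homework_scores:
        if not 0 <= score <= POINTS_PER_HOMEWORK:
            raise ValueError("homework score out of range")
    if not 0 <= test_score <= FINAL_TEST_POINTS:
        raise ValueError("test score out of range")
    return sum(homework_scores) + test_score


if __name__ == "__main__":
    print(MAX_POINTS)
    print(total_score([2, 1.5, 2, 1], 1.5))
```
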

Contributors & course staff

Author + Lectures: Pavel Severilov

Seminars: Viacheslav Shokorov

Helped build course materials and held seminars: Daniel Knyazev
