Speech Recognition and Understanding (11-751/18-781)
Course Logistics
- Instructor: Shinji Watanabe
- TAs: Xuankai Chang, Yifan Peng, Brian Yan
- Time: MW 3:30PM – 4:50PM
- Location: GHC 4307
- Discussion: Piazza
Grading
- Grading policies
- Class Participation (25%)
- Assignments (30%)
- Midterm exam (20%)
- Term Project (25%)
- We will use Gradescope.
Syllabus
- This is a tentative schedule.
- The slides will be uploaded right before the lecture.
- The videos will be uploaded irregularly after the lecture because of the editing process.
Date | Lecture | Topics | Slides/Videos |
---|---|---|---|
8/28 | Course overview | Course explanation and introduction | |
8/30 | Introduction of speech recognition | - Evaluation metric - How to transcribe speech - Databases | |
9/6 | Speech recognition formulations | - Probabilistic rules - From Bayes decision theory to HMM + n-gram, CTC, RNN-T, and attention | |
9/11 | Feature extraction | - Basic pipeline - Some advances in feature extraction | |
9/13 | Acoustic model overview | ||
9/18 | Alignment problems | - 3-state left-to-right HMM - CTC - Transducer | |
9/20 | K-means, GMM, EM algorithm | ||
9/25 | Forward-backward algorithm for HMM | ||
9/27 | Forward-backward algorithm for HMM | ||
10/2 | Forward-backward algorithm for CTC and Viterbi algorithm | ||
10/2 | N-gram language model | ||
10/9 | Midterm exam | ||
10/11 | Search | - Time-synchronous beam search - Label-synchronous beam search - N-best and lattice - Rescoring | |
10/23 | ESPnet hands-on tutorial I | - Introduction of toolkit - How to make a new recipe | |
10/25 | ESPnet hands-on tutorial II | - How to make a new task | |
10/30 | Deep neural network for acoustic modeling | ||
11/1 | Neural network language model | ||
11/6 | End-to-End ASR: Attention | ||
11/8 | End-to-End ASR: CTC | ||
11/13 | End-to-End ASR: RNN-T | ||
11/15 | Advanced topics on end-to-end ASR I | ||
11/20 | Advanced topics on end-to-end ASR II | ||
11/27 | Guest Lecture | ||
11/29 | Guest Lecture | ||
12/4 | Project Event | ||
12/6 | Project Event | ||
Assignments
Assignments will be announced during the course.