2022 Reading Group
2021.12.21 ASRU 2021 Paper List
- Data Augmentation for ASR Using TTS via A Discrete Representation
- Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
2022.01.11 ASRU 2021 Paper List
- Improving HS-DACS Based Streaming Transformer ASR with Deep Reinforcement Learning
- Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition
- TS-RIR: Translated Synthetic Room Impulse Responses for Speech Augmentation
2022.01.19 ASRU 2021 Paper List
- Unsupervised Domain Adaptation Schemes for Building ASR in Low-Resource Languages
- Relaxed Attention: A Simple Method to Boost Performance of End-To-End Automatic Speech Recognition
2022.02.02 ASRU 2021 Paper List
- Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition
- Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures
- Joint prediction of truecasing and punctuation for conversational speech in low-resource scenarios
2022.02.16 Survey of Streaming SLU (presented by Siddhant Arora)
2022.02.23 NeurIPS 2021 Paper List
- Unsupervised Speech Recognition
- A Universal Law of Robustness via Isoperimetry
- Speech-T: Transducer for Text to Speech and Beyond
2022.03.02 NeurIPS 2021 Paper List
- Multimodal and Multilingual Embeddings for Large-Scale Speech Mining
- Pay Attention to MLPs
- Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
2022.03.30 NeurIPS 2021 Paper List
- FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
- Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
- Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport
2022.04.06 Adapters in Speech Transformers (presented by Karthik Ganesan)
2022.04.13 NeurIPS 2021 Paper List
- Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems
- Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling
- Towards efficient end-to-end speech recognition with biologically-inspired neural networks
2022.04.27 Survey of Semi-Supervised ASR (presented by Dan Berrebbi)
2022.05.04 The VoiceMOS Challenge 2022 (presented by Wen-Chin Huang from Nagoya University)
- The VoiceMOS Challenge 2022
- CodaLab Challenge Page
- Paper
- Baseline System 1
- Baseline System 2
- Baseline System 3
2022.06.01 ICASSP 2022 Paper List
- VarArray: Array-Geometry-Agnostic Continuous Speech Separation
- Self Supervised Representation Learning with Deep Clustering for Acoustic Unit Discovery from Raw Speech
- TPARN: Triple-path Attentive Recurrent Network for Time-domain Multichannel Speech Enhancement
2022.06.08 ICASSP 2022 Paper List
- One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement
- Self-supervised Speaker Recognition Training Using Human-Machine Dialogues
- Self-Supervised Learning Method Using Multiple Sampling Strategies for General-Purpose Audio Representation
2022.06.15 ICASSP 2022 Paper List
- Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition
- Factorized Neural Transducer for Efficient Language Model Adaptation
- NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
- Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems
2022.07.27 ICML 2022 Paper List
- Revisiting End-to-End Speech-to-Text Translation From Scratch
- Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
- Efficient Representation Learning via Adaptive Context Pooling
- Characterizing and Overcoming the Greedy Nature of Learning in Multi-modal Deep Neural Networks
- GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
2022.08.03 ICML 2022 Paper List
- ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
- A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
- Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
- Multi Resolution Analysis (MRA) for Approximate Self-Attention
2022.08.10 ICASSP 2022 Paper List
- Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI
- On Language Model Integration for RNN Transducer Based Speech Recognition
- Tight Integration Of Neural- And Clustering-Based Diarization Through Deep Unfolding Of Infinite Gaussian Mixture Model
2022.08.24 ICASSP 2022 Paper List
- Spatial-Temporal Graph Convolution Network for Multichannel Speech Enhancement
- Exploring Machine Speech Chain For Domain Adaptation
- Improving CTC-based speech recognition via knowledge transferring from pre-trained language models
2022.11.03 INTERSPEECH 2022 Paper List
- CTC Variations Through New WFST Topologies
- Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
- Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition
2022.11.10 INTERSPEECH 2022 Paper List
- Text-Only Domain Adaptation Based on Intermediate CTC
- Distilling a Pretrained Language Model to a Multilingual ASR Model
- Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models
2022.12.01 INTERSPEECH 2022 Paper List
- Deliberation Model for On-Device Spoken Language Understanding
- Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
- Weakly-Supervised Neural Full-Rank Spatial Covariance Analysis for a Front-End System of Distant Speech Recognition