WAVLab | 2022 Papers

TTS AAAI

A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech

Li-Wei Chen, Alexander Rudnicky, and Shinji Watanabe

In Proceedings of AAAI 2022
ASR EMNLP

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, and Shinji Watanabe

In Proceedings of Findings of EMNLP 2022
SLU EMNLP

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W Black, and Shinji Watanabe

In Proceedings of Findings of EMNLP 2022
SD TASLP

Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors

Shota Horiguchi, Shinji Watanabe, Paola Garcia, Yuki Takashima, and Yohei Kawaguchi

In IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022
SE CSL

A Dilemma of Ground Truth in Noisy Speech Separation and an Approach to Lessen the Impact of Imperfect Training Data

Matthew Maciejewski, Jing Shi, Shinji Watanabe, and Sanjeev Khudanpur

In Computer Speech & Language 2022
SE TASLP

End-to-End Dereverberation, Beamforming, and Speech Recognition in A Cocktail Party

Wangyou Zhang, Xuankai Chang, Christoph Boeddeker, Tomohiro Nakatani, Shinji Watanabe, and Yanmin Qian

In IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022
SE SPL

Improving Frame-Online Neural Speech Enhancement with Overlapped-Frame Prediction

Zhong-Qiu Wang and Shinji Watanabe

In IEEE Signal Processing Letters 2022
SD TASLP

Encoder-Decoder Based Attractors for End-to-End Neural Diarization

Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Paola Garcia

In IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022
ASR JSTSP

Self-Supervised Speech Representation Learning: A Review

Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, and Shinji Watanabe

In IEEE Journal of Selected Topics in Signal Processing 2022
ST IWSLT

Findings of the IWSLT 2022 Evaluation Campaign

Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, and Shinji Watanabe

In Proceedings of IWSLT 2022
SD&SS SLT

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Yushi Ueda, Soumi Maiti, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, and Yong Xu

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR&SD&SLU&ER SLT

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdel-rahman Mohamed, Shang-Wen Li, and Hung-yi Lee

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR SLT

E-Branchformer: Branchformer with Enhanced merging for speech recognition

Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu Jeong Han, and Shinji Watanabe

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR&SLU SLT

A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, and Shinji Watanabe

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR&SSL SLT

On Compressing Sequences for Self-Supervised Speech Models

Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola Garcia, Hung-yi Lee, and Hao Tang

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR&SE&SSL SLT

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, and Nobutaka Ono

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
SD SLT

Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization

Shota Horiguchi, Yuki Takashima, Shinji Watanabe, and Paola Garcia

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR SLT

End-to-End Multi-speaker ASR with Independent Vector Analysis

Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, and Yanmin Qian

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) 2022
ASR Interspeech

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Jiatong Shi, George Saon, David Haws, Shinji Watanabe, and Brian Kingsbury

In Proceedings of Interspeech 2022
ASR Interspeech

Memory-Efficient Training of RNN-Transducer with Sampled Softmax

Jaesong Lee, Lukas Lee, and Shinji Watanabe

In Proceedings of Interspeech 2022
SLU&ST Interspeech

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Keqi Deng, Shinji Watanabe, Jiatong Shi, and Siddhant Arora

In Proceedings of Interspeech 2022
Music Interspeech

SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy

Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, and Qin Jin

In Proceedings of Interspeech 2022
Music Interspeech

Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis

Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, and Qin Jin

In Proceedings of Interspeech 2022
ASR Interspeech

Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis

Hang Chen, Jun Du, Yusheng Dai, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Baocai Yin, and Jia Pan

In Proceedings of Interspeech 2022
KWS Interspeech

Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis

Hengshun Zhou, Jun Du, Gongzhen Zou, Zhaoxu Nian, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Shifu Xiong, and Jian-Qing Gao

In Proceedings of Interspeech 2022
ASR Interspeech

ASR2K: Speech Recognition for Around 2000 Languages without Audio

Xinjian Li, Florian Metze, David R. Mortensen, Alan W Black, and Shinji Watanabe

In Proceedings of Interspeech 2022
SE Interspeech

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, and Shinji Watanabe

In Proceedings of Interspeech 2022
SLU Interspeech

Two-Pass Low Latency End-to-End Spoken Language Understanding

Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan W Black, and Shinji Watanabe

In Proceedings of Interspeech 2022
TTS Interspeech

Deep Speech Synthesis from Articulatory Representations

Peter Wu, Shinji Watanabe, Louis Goldstein, Alan W Black, and Gopala Krishna Anumanchipalli

In Proceedings of Interspeech 2022
ASR Interspeech

Minimum latency training of sequence transducers for streaming end-to-end speech recognition

Yusuke Shinohara and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR Interspeech

Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection

Yui Sudo, Shakeel Muhammad, Kazuhiro Nakadai, Jiatong Shi, and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR Interspeech

Better Intermediates Improve CTC Inference

Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, and Yusuke Kida

In Proceedings of Interspeech 2022
ASR Interspeech

Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models

Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Leibny Paola Garcia Perera, and Yohei Kawaguchi

In Proceedings of Interspeech 2022
ASR Interspeech

Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR

Takashi Maekaku, Yuya Fujita, Yifan Peng, and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR Interspeech

Residual Language Model for End-to-end Speech Recognition

Emiru Tsunoo, Yosuke Kashiwagi, Chaitanya Prasad Narisetty, and Shinji Watanabe

In Proceedings of Interspeech 2022
TTS Interspeech

When Is TTS Augmentation Through a Pivot Language Useful?

Nathaniel Romney Robinson, Perez Ogayo, Swetha R. Gangu, David R. Mortensen, and Shinji Watanabe

In Proceedings of Interspeech 2022
TTS Interspeech

TriniTTS: Pitch-controllable End-to-end TTS without External Aligner

Yooncheol Ju, Ilhwan Kim, Hongsun Yang, Ji-Hoon Kim, Byeongyeol Kim, Soumi Maiti, and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR Interspeech

Online Continual Learning of End-to-End Speech Recognition Models

Muqiao Yang, Ian Lane, and Shinji Watanabe

In Proceedings of Interspeech 2022
SE Interspeech

Improving Speech Enhancement through Fine-Grained Speech Characteristics

Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, and Bhiksha Raj

In Proceedings of Interspeech 2022
ASR&SE&SSL Interspeech

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Xuankai Chang, Takashi Maekaku, Yuya Fujita, and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR&SSL Interspeech

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel López-Francisco, Jonathan Amith, and Shinji Watanabe

In Proceedings of Interspeech 2022
ASR&SLU&MT ICML

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

Yifan Peng, Siddharth Dalmia, Ian Lane, and Shinji Watanabe

In Proceedings of the International Conference on Machine Learning (ICML) 2022
Linguistic ACL

Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble

Xinjian Li, Florian Metze, David R Mortensen, Shinji Watanabe, and Alan Black

In Proceedings of Findings of the Annual Meeting of the Association for Computational Linguistics 2022
SE&VC&ST ACL

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi Lee

In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2022
SE&ASR CSL

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition

Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, and Dong Yu

In Computer Speech & Language 2022
SD CSL

A review of speaker diarization: Recent advances with deep learning

Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J Han, Shinji Watanabe, and Shrikanth Narayanan

In Computer Speech & Language 2022
SE&ASR CSL

Joint speaker diarization and speech recognition based on region proposal networks

Zili Huang, Marc Delcroix, Leibny Paola Garcia, Shinji Watanabe, Desh Raj, and Sanjeev Khudanpur

In Computer Speech & Language 2022
ASR CSL

Arabic speech recognition by end-to-end, modular systems and human

Amir Hussein, Shinji Watanabe, and Ahmed Ali

In Computer Speech & Language 2022
Multimodal ICASSP

The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results

Hang Chen, Hengshun Zhou, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Di-Yuan Liu, Bao-Cai Yin, Jia Pan, Jian-Qing Gao, and Cong Liu

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Non-Autoregressive End-to-End Automatic Speech Recognition Incorporating Downstream Natural Language Processing

Motoi Omachi, Yuya Fujita, Shinji Watanabe, and Tianzi Wang

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

An Exploration of HuBERT with Large Number of Cluster Units and Model Assessment Using Bayesian Information Criterion

Takashi Maekaku, Xuankai Chang, Yuya Fujita, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SE&SSL ICASSP

Investigating Self-Supervised Learning for Speech Enhancement and Separation

Zili Huang, Shinji Watanabe, Shu-wen Yang, Paola Garcia, and Sanjeev Khudanpur

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SE ICASSP

Conditional Diffusion Probabilistic Model for Speech Enhancement

Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, and Yu Tsao

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models

Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, and Pengyuan Zhang

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Integrating multiple ASR systems into NLP backend with attention fusion

Takatomo Kano, Atsunori Ogawa, Marc Delcroix, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SLU ICASSP

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W Black, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization

Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, and Dong Yu

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR

Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, and Jonathan Le Roux

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Sequence Transduction with Graph-based Supervision

Niko Moritz, Takaaki Hori, Shinji Watanabe, and Jonathan Le Roux

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Run-and-Back Stitch Search: Novel Block Synchronous Decoding for Streaming Encoder-Decoder ASR

Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
VC&SSL ICASSP

S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations

Wen-Chin Huang, Shu-wen Yang, Tomoki Hayashi, Hung-yi Lee, Shinji Watanabe, and Tomoki Toda

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

Joint Speech Recognition and Audio Captioning

Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SD ICASSP

Multi-Channel End-to-End Neural Diarization with Distributed Microphones

Shota Horiguchi, Yuki Takashima, Paola Garcia, Shinji Watanabe, and Yohei Kawaguchi

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
ASR ICASSP

TorchAudio: Building Blocks for Audio and Speech Processing

Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Artyom Astafurov, Caroline Chen, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Yang, Jason Lian, Jeff Hwang, Ji Chen, Peter Goldsborough, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, and Vincent Quenneville-Bélair

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SD ICASSP

Towards End-to-End Speaker Diarization with Generalized Neural Speaker Clustering

Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu, and Dong Yu

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
Music ICASSP

Training Strategies for Automatic Song Writing: A Unified Framework Perspective

Tao Qian, Jiatong Shi, Shuai Guo, Peter Wu, and Qin Jin

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
SE&ASR CSL

An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer

Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, and Dong Yu

In Computer Speech & Language 2022

Target-speaker speech recognition aims to recognize the speech of an enrolled speaker in an environment with background noise and interfering speakers. This study presents a joint framework that combines time-domain target speaker extraction with a recurrent neural network transducer (RNN-T) for speech recognition. To alleviate the adverse effects of residual noise and artifacts that the target speaker extraction module introduces into the speech recognition back-end, we explore training the target speaker extraction module and the RNN-T jointly. We find that a multi-stage training strategy, which pre-trains and fine-tunes each module before joint training, is crucial for stabilizing the training process. In addition, we propose a novel neural uncertainty estimation (i.e., speaker identity uncertainty and speech enhancement uncertainty) that leverages useful information from the target speaker extraction module to further improve the back-end speech recognizer. Compared to a recognizer with a target speech extraction front-end, our experiments show that joint training and the neural uncertainty module reduce the character error rate (CER) by 7% and 17% relative, respectively, on multi-talker simulation data. The multi-condition experiments indicate that our method can reduce CER by 9% relative in the noisy condition without losing performance in the clean condition. We also observe consistent improvements in a further evaluation on real-world vehicular speech data.
SE ICASSP

Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge

Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022