1. TTS ICASSP
    ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit
    Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, and Xu Tan
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
  2. ST ACL
    ESPnet-ST: All-in-One Speech Translation Toolkit
    Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Yalta, Tomoki Hayashi, and Shinji Watanabe
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2020
  3. SR&SSL NeurIPS
    Augmentation adversarial training for self-supervised speaker recognition
    Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, and Joon Son Chung
    2020
  4. SED DCASE
    Conformer-based sound event detection with semi-supervised learning and data augmentation
    Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, and Kazuya Takeda
    2020
  5. ASR CHiME
    The JHU multi-microphone multi-speaker ASR system for the CHiME-6 challenge
    Ashish Arora, Desh Raj, Aswin Shanmugam Subramanian, Ke Li, Bar Ben-Yair, Matthew Maciejewski, Piotr Żelasko, Paola Garcia, Shinji Watanabe, and Sanjeev Khudanpur
    2020
  6. ASR ICASSP
    End-to-end multi-speaker speech recognition with transformer
    Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe
    2020
  7. TTS ICASSP
    Semi-supervised speaker adaptation for end-to-end speech synthesis with pretrained models
    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, and Shinji Watanabe
    2020
  8. ASR ICASSP
    End-to-end automatic speech recognition integrated with CTC-based voice activity detection
    Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, and Shinji Watanabe
    2020
  9. ASR ICASSP
    Attention-based ASR with lightweight and dynamic convolutions
    Yuya Fujita, Aswin Shanmugam Subramanian, Motoi Omachi, and Shinji Watanabe
    2020
  10. ASR ICASSP
    A practical two-stage training strategy for multi-stream end-to-end speech recognition
    Ruizhi Li, Gregory Sell, Xiaofei Wang, Shinji Watanabe, and Hynek Hermansky
    2020
  11. SD ICASSP
    Speaker diarization with region proposal network
    Zili Huang, Shinji Watanabe, Yusuke Fujita, Paola García, Yiwen Shao, Daniel Povey, and Sanjeev Khudanpur
    2020
  12. SED ICASSP
    Weakly-supervised sound event detection with self-attention
    Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, and Kazuya Takeda
    2020
  13. SE ICASSP
    Far-field location guided target speech extraction using end-to-end speech recognition objectives
    Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, and Dong Yu
    2020
  14. ASR Deep Neural Evolution
    Automated Development of DNN Based Spoken Language Systems Using Evolutionary Algorithms
    Takahiro Shinozaki, Shinji Watanabe, and Kevin Duh
    2020
  15. ASR&TTS VCC
    The sequence-to-sequence baseline for the voice conversion challenge 2020: Cascading ASR and TTS
    Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, and Tomoki Toda
    2020
  16. SE&ASR NeurIPS
    Sequence to multi-sequence learning via conditional chain mapping for mixture signals
    Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, and Lei Xie
    2020
  17. ASR Interspeech
    End-to-End ASR with Adaptive Span Self-Attention
    Xuankai Chang, Aswin Shanmugam Subramanian, Pengcheng Guo, Shinji Watanabe, Yuya Fujita, and Motoi Omachi
    2020
  18. TTS Interspeech
    Learning speaker embedding from text-to-speech
    Jaejin Cho, Piotr Żelasko, Jesús Villalba, Shinji Watanabe, and Najim Dehak
    2020
  19. SE Interspeech
    Speaker-conditional chain model for speech separation and extraction
    Jing Shi, Jiaming Xu, Yusuke Fujita, Shinji Watanabe, and Bo Xu
    2020
  20. ASR Interspeech
    Insertion-based modeling for end-to-end automatic speech recognition
    Yuya Fujita, Shinji Watanabe, Motoi Omachi, and Xuankai Chang
    2020
  21. SD Interspeech
    End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors
    Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Kenji Nagamatsu
    2020
  22. ASR Interspeech
    End-to-end far-field speech recognition with unified dereverberation and beamforming
    Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe, and Yanmin Qian
    2020
  23. ASR Interspeech
    Mask CTC: Non-autoregressive end-to-end ASR with CTC and mask predict
    Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, and Tetsunori Kobayashi
    2020