  1. SE ICASSP
    The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction
    Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, and Jianqing Gao
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  2. Audio ICASSP
    Improving Continual Learning of Acoustic Scene Classification via Mutual Information Optimization
    Muqiao Yang, Umberto Cappellazzo, Xiang Li, Shinji Watanabe, and Bhiksha Raj
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  3. ASR ICASSP
    Improving ASR Contextual Biasing with Guided Attention
    Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  4. SLU ICASSP
    AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language Models
    Jee-weon Jung, Roshan Sharma, William Chen, Bhiksha Raj, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  5. ASR&TTS ICASSP
    VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks
    Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  6. ASR ICASSP
    Speech Collage: Code-Switched Audio Generation by Collaging Monolingual Corpora
    Amir Hussein, Dorsa Zeinali, Ondřej Klejch, Matthew Wiesner, Brian Yan, Shammur Chowdhury, Ahmed Ali, Shinji Watanabe, and Sanjeev Khudanpur
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  7. ST ICASSP
    Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
    Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, and Sanjeev Khudanpur
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  8. ASR ICASSP
    PhISANet: Phonetically Informed Speech Animation Network
    Salvador Medina, Sarah Taylor, Carsten Stoll, Gareth Edwards, Alex Hauptmann, Shinji Watanabe, and Iain Matthews
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  9. SD&ASR ICASSP
    One Model to Rule Them All? Towards End-to-End Joint Speaker Diarization and Speech Recognition
    Samuele Cornell, Jee-weon Jung, Shinji Watanabe, and Stefano Squartini
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  10. ASR ICASSP
    Less Peaky and More Accurate CTC Forced Alignment by Pruned CTC Loss and Label Priors
    Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Shinji Watanabe, Daniel Povey, and Sanjeev Khudanpur
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  11. SSL ICASSP
    HuBERTopic: Enhancing Semantic Representation of HuBERT Through Self-Supervision Utilizing Topic Model
    Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  12. ASR&ST&SLU ICASSP
    Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
    Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-weon Jung, Yichen Lu, Soumi Maiti, Roshan Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, and Hsiu-Hsuan Wang
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  13. LLM&SLU ICASSP
    Dynamic-SUPERB: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
    Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chun-Yi Kuan, Chi-Yuan Hsiao, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, and Hung-yi Lee
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  14. ST ICASSP
    Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
    Brian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  15. ASR ICASSP
    Semi-Autoregressive Streaming ASR with Label Context
    Siddhant Arora, George Saon, Shinji Watanabe, and Brian Kingsbury
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  16. SSL ICASSP
    Generative Context-Aware Fine-Tuning of Self-Supervised Speech Models
    Suwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe, and Karen Livescu
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  17. ASR ICASSP
    Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search
    Yui Sudo, Shakeel Muhammad, Yosuke Fukumoto, Yifan Peng, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  18. SSL ICASSP
    Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing
    William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  19. SE ICASSP
    Improving Design of Input Condition Invariant Speech Enhancement
    Wangyou Zhang, Jee-weon Jung, Shinji Watanabe, and Yanmin Qian
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  20. ASR ICASSP
    Phoneme-Aware Encoding for Prefix-Tree-Based Contextual ASR
    Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  21. SS ICASSP
    Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor
    Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhong-Qiu Wang, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  22. ASR ICASSP
    Visual Speech Recognition for Low-Resource Languages with Automatic Labels from Whisper Model
    Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, and Yong Man Ro
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  23. Caption ICASSP
    Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens
    Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  24. SSL ICASSP
    Understanding Probe Behaviors Through Variational Bounds of Mutual Information
    Kwanghee Choi, Jee-weon Jung, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  25. Caption ICASSP
    Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation
    Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, and Shinji Watanabe
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  26. SSL ICASSP
    AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
    Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi Luen Feng, and Hung-yi Lee
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024