Investigating Zero-shot Generalizability On Mandarin-English Code-switched ASR And Speech-to-text Translation Of Recent Foundation Models With Self-supervision And Weak Supervision

Yang Chih-kai, Huang Kuan-po, Lu Ke-han, Kuan Chun-yi, Hsiao Chi-yuan, Lee Hung-yi. arXiv 2023

[Paper]    
Training Techniques

This work evaluated several cutting-edge large-scale foundation models based on self-supervision or weak supervision, including SeamlessM4T, SeamlessM4T v2, and Whisper-large-v3, on three code-switched corpora. We found that self-supervised models can achieve performance close to that of the supervised model, indicating the effectiveness of multilingual self-supervised pre-training. We also observed that these models still have room for improvement, as they kept making similar mistakes and performed unsatisfactorily on intra-sentential code-switching. In addition, we explored the validity of several variants of Whisper and concluded that they remain effective in a code-switching scenario; similar techniques for self-supervised models are worth studying to boost performance on code-switched tasks.
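The kind of zero-shot evaluation described above can be reproduced with off-the-shelf tooling. Below is a minimal sketch (not the paper's actual evaluation harness) of running Whisper-large-v3 zero-shot on a Mandarin-English code-switched utterance via the Hugging Face `transformers` pipeline; the audio path and reference transcript are hypothetical placeholders, and the simple word-level WER from `jiwer` only approximates the mixed error rate typically reported for code-switched benchmarks.

```python
# Sketch: zero-shot code-switched ASR with Whisper-large-v3 (assumptions noted below).
from transformers import pipeline
import jiwer  # simple error-rate computation

# Load Whisper-large-v3 as an automatic-speech-recognition pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
)

# Hypothetical code-switched audio file and its reference transcript.
audio_path = "example_code_switched.wav"
reference = "我今天要去 meeting 然后再回 office"

# Zero-shot transcription: no fine-tuning and no forced language token,
# so the model must handle the intra-sentential language switches itself.
hypothesis = asr(audio_path, generate_kwargs={"task": "transcribe"})["text"]

# Word-level WER as a rough proxy for the mixed error rate
# (characters for Mandarin, words for English).
print("HYP:", hypothesis)
print("WER:", jiwer.wer(reference, hypothesis))
```

The same pipeline call can be swapped to other checkpoints (e.g., SeamlessM4T via its own inference API) to compare models under the same zero-shot conditions.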

Similar Work