News

Code-switching arises when a (typically multilingual) speaker changes language during an utterance. This linguistic phenomenon causes problems for automatic speech recognition as the models are ...
This paper proposes a novel collaborative dysarthric speech recognition system designed to convert dysarthric speech into non-dysarthric speech to enhance the robustness of automatic speech ...
Text-to-Speech for over 7000 Languages: IMS Toucan is a toolkit for training, using, and teaching state-of-the-art Text-to-Speech Synthesis, developed at the Institute for Natural Language Processing ...
Stream-Omni enables seamless interactions across text, vision, and speech using a large language model. This repository includes the model, datasets, and tools for developers to explore multimodal ...