Open-source TTS models Kokoro, Orpheus, and Piper are tested on symbols, abbreviations, and prosody, with character error rate (CER) and mean opinion score (MOS) results.
Mistral launches Voxtral TTS, extending its model family into speech generation and enabling end-to-end voice workflows.
The official website for Google AI Edge Eloquent is hosted on Google’s developer-focused google.dev domain, underscoring that ...
Cohere has released Transcribe, a 2-billion-parameter open-source speech recognition model that tops the Hugging Face Open ...
Mistral AI is expanding its Voxtral model family with its first text-to-speech model. The launch comes amid intensifying competition in the fast-growing AI voice market, with Voxtral TTS pitched as an ...
French AI company Mistral released a new open-source text-to-speech model on Thursday that can be used by voice AI assistants or in enterprise use cases like customer support. The model, which lets ...
Small and fast: only 123M parameters. High-quality voice cloning: state-of-the-art performance in speaker similarity, intelligibility, and naturalness. Multilingual: supports Chinese and English.
Abstract: Speech synthesis, the technology that converts text into spoken words, has advanced significantly for high-resource languages like English, Spanish, and Mandarin. However, many languages ...
#3 Winner of Best Use of Zoom API at Stanford TreeHacks 2025! An AI-powered meeting assistant that captures video, audio and textual context from Zoom calls using multimodal RAG. WhisperVoice is a ...