Open source TTS models Kokoro, Orpheus, and Piper are tested on symbols, abbreviations, and prosody with CER and MOS results.
Abstract: In the era of free speech and rapid internet expansion, curbing the dissemination of offensive content on social media has become a pressing concern for linguists and regulatory bodies. Hate ...
This repo is a minimalist and extensible framework for benchmarking various aspects of different text-to-speech (TTS) engines. This benchmark simulates user - voice-assistant interactions, by ...
Abstract: 3D printing is a revolutionary technology that enables the creation of physical objects from digital models. However, the quality and accuracy of 3D printing depend on the correctness and ...
French AI company Mistral released a new open source text-to-speech model on Thursday that can be used by voice AI assistants or in enterprise use cases like customer support. The model, which lets ...
This paper introduces VALL-E 2, the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time.
Small and fast: only 123M parameters. High-quality voice cloning: state-of-the-art performance in speaker similarity, intelligibility, and naturalness. Multilingual: supports Chinese and English.