Voice Adaptation in Your Dialect – CAI Clones Voices for SRF Einstein

Researchers at the Center for Artificial Intelligence (CAI) have developed a voice adaptation system that can mimic any voice and make it speak in various Swiss German dialects. The technology was recently featured on SRF’s science program Einstein and will be presented at the Interspeech conference.

At the CAI, our researchers have been exploring the frontier of speech synthesis, focusing on the challenge of adapting voices to regional dialects. The result is a voice cloning system that can recreate a person’s voice from just a few seconds of audio and then make that voice speak fluently in different Swiss German dialects.

This technology was recently showcased on Swiss national television as part of SRF’s Einstein program. For the broadcast, we cloned the voice of the Einstein moderator and had him read traditional stories from the Sarganserland region, spoken in the local dialect. Although the system hasn’t yet mastered every nuance of the dialect, the synthetic voice was realistic enough to fool several of the moderator’s colleagues.

The project highlights the potential of AI-powered voice adaptation in applications ranging from media production to language preservation. At CAI, we continue to refine the system, aiming for even more accurate and expressive dialect synthesis.

You can watch the full Einstein episode and read the paper to learn more about the technical foundations of our work.

The research behind this system has been accepted for presentation at Interspeech, the world’s largest conference on spoken language processing. The paper details the technical innovations behind the model, including its ability to adapt voices across dialects. A preprint is available.

The research was supported by the DFF Initiative and by Samuel Stucki’s master’s thesis.