End-to-End Low-Resource Speech Translation for Swiss German Dialects

At a glance


This project investigates the application of recent findings in Speech Translation to Swiss German. Speech Translation (ST) is the task of translating spoken utterances in one language into written text in a different language. It is an essential tool for breaking down language barriers in various communication settings and a promising means of preserving endangered languages.

We will investigate how ST can be applied to Swiss German dialects, i.e. how to translate Swiss German speech into Standard German text. This has numerous important real-life applications, such as voice assistants like Siri or Alexa, interview transcription, generation of meeting minutes, and evaluation of call-center dialogues. The rationale for translating into Standard German is that it provides a unified written form and lets us benefit from the abundance of NLP methods available for Standard German, a well-studied high-resource language (e.g. part-of-speech tagging, named entity recognition, sentiment analysis, and summarisation).

ST is an appealing approach for Swiss German, since it does not rely on an intermediate textual representation in the source language (note that there is no unified written form for Swiss German). However, it usually requires a large amount of annotated training data, which is not available for the numerous Swiss German dialects. For this reason, we will investigate how ST systems could be built for Swiss German dialects without the need to generate annotated data for each dialect. More precisely, we will:

  1. create a large-scale parallel corpus of 450 hours of audio in 7 major Swiss German dialects, with corresponding translations into Standard German text. This corpus is generated in a fully controlled environment, allowing it to be used for scientifically sound experiments on Swiss German dialects.
  2. implement 3 ST systems and run experiments on how to optimally train an ST system for different dialects.
  3. investigate how to translate training data between different dialects of Swiss German, using speech-to-speech translation, vocabulary enhancement and voice adaptation, to mitigate the lack of training data for most Swiss German dialects.
  4. compile a set of recommendations and best practices on how to create general-purpose ST systems for Swiss German dialects.

The results of this project will pave the way for developing practical ST solutions for Swiss German on a broad scale and help Swiss German persist as part of Switzerland’s cultural heritage in the digital age.