End-to-End Low-Resource Speech Translation for Swiss German Dialects

Description

This project investigates the application of recent findings in Speech Translation to Swiss German. Speech Translation (ST) is the task of translating spoken utterances in one language into written text in a different language. It serves as an essential tool for breaking down language barriers in various communication settings and is a promising means of preserving endangered languages.

We will investigate how ST can be applied to Swiss German dialects, i.e. how to translate speech in Swiss German into text in Standard German. This has numerous important real-life applications, e.g. voice bots such as Siri or Alexa, interview transcription, generating meeting protocols, evaluation of call-center dialogues, etc. The rationale behind transcribing to Standard German is that this provides a unified written form and allows us to benefit from the abundance of NLP methods that exist for Standard German as a well-studied high-resource language (e.g. part-of-speech tagging, named entity recognition, sentiment analysis, summarisation, etc.).

ST is an appealing approach for Swiss German, since it does not rely on an intermediate textual representation in the source language (note that there is no unified written form for Swiss German). However, it usually requires a large amount of annotated training data, which is not available for the numerous Swiss German dialects. For this reason, we will investigate how ST systems could be built for Swiss German dialects without the need to generate annotated data for each dialect. More precisely, we will:

create a large-scale parallel corpus of 450 hours of audio in 7 major Swiss dialects and corresponding translations in Standard German text. This corpus will be generated in a fully controlled environment, allowing it to be used for scientifically sound experiments on Swiss German dialects.
implement 3 ST systems and run experiments on how to optimally train an ST system for different dialects.
investigate how to translate training data between different dialects of Swiss German, using Speech-to-Speech translation, Vocabulary Enhancement and Voice Adaptation, to mitigate the lack of training data for most Swiss German dialects.
compile a set of recommendations and best practices on how to create general-purpose ST systems for Swiss German dialects.
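
As a rough illustration of the first two goals, the sketch below shows one way a parallel corpus entry could be represented and how a leave-one-dialect-out split might be formed to test an ST system on a dialect it has not seen in training. The record fields, the dialect labels, and the split strategy are assumptions made for this example only; the project description does not specify the corpus format or the experimental protocol.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One hypothetical corpus entry: Swiss German audio plus Standard German text."""
    audio_path: str    # assumed field: location of the Swiss German recording
    dialect: str       # assumed field: dialect label of the speaker
    duration_s: float  # assumed field: length of the recording in seconds
    target_text: str   # assumed field: Standard German translation


def hours_per_dialect(utterances):
    """Sum the audio duration per dialect and return it in hours."""
    totals = defaultdict(float)
    for u in utterances:
        totals[u.dialect] += u.duration_s
    return {dialect: seconds / 3600.0 for dialect, seconds in totals.items()}


def leave_one_dialect_out(utterances, held_out):
    """Split so that the held-out dialect appears only in the test set."""
    train = [u for u in utterances if u.dialect != held_out]
    test = [u for u in utterances if u.dialect == held_out]
    return train, test
```

A split of this kind would make it possible to measure how well a system trained on some dialects carries over to another one, which is the situation the project aims to address for dialects without annotated data.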

The results of this project will pave the way for developing practical ST solutions for Swiss German on a broad scale and contribute to the prevalence of Swiss German as part of Switzerland’s cultural heritage in the digital age.