End-to-End Low-Resource Speech Translation for Swiss German Dialects

Description

This project investigates the application of recent findings in Speech Translation to
Swiss German. Speech Translation (ST) is the task of translating spoken utterances in
one language into written text in a different language. It serves as an essential tool for
breaking down language barriers in various communication settings and a promising
means of preserving endangered languages.

We will investigate how ST can be applied to Swiss German dialects, i.e. to translate
speech in Swiss German into text in Standard German. This has numerous important
real-life applications, e.g. voice assistants such as Siri or Alexa, interview transcription,
generating meeting protocols, evaluation of call-center dialogues, etc. The rationale
behind translating to Standard German is that this will provide a unified written form
and will allow us to benefit from the abundance of NLP methods that exist for Standard
German as a well-studied high-resource language (e.g. part-of-speech tagging, named
entity recognition, sentiment analysis, summarisation, etc.).

ST is an appealing approach for Swiss German, since it does not rely on an
intermediate textual representation in the source language (note that there is no
unified written form for Swiss German). However, it usually requires a large amount of
annotated training data, which is not available for the numerous Swiss German
dialects. For this reason, we will investigate how ST systems could be built for Swiss
German dialects without the need to generate annotated data for each dialect. More
precisely, we will:
1. create a large-scale parallel corpus of 450 hours of audio in 7 major Swiss dialects
and corresponding translations in Standard German text. This corpus is generated in a
fully controlled environment, allowing it to be used for scientifically sound experiments
on Swiss German dialects.
2. implement 3 ST systems and run experiments on how to optimally train an ST
system for different dialects.
3. investigate how to translate training data between different dialects of Swiss German,
using Speech-to-Speech translation, Vocabulary Enhancement and Voice Adaptation,
to mitigate the lack of training data for most Swiss German dialects.
4. compile a set of recommendations and best practices on how to create general
purpose ST systems for Swiss German dialects.
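
To make the corpus described in step 1 more concrete, the following is a minimal sketch of how such parallel data could be represented and checked for per-dialect coverage. It is not the project's actual data format: the `Utterance` fields, the file paths, the dialect labels `dialect_a` and `dialect_b`, and the helper names are all hypothetical, and the durations are toy values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One parallel example: dialect audio plus its Standard German translation."""
    audio_path: str     # hypothetical path to the recording
    dialect: str        # hypothetical dialect label
    target_text: str    # translation in Standard German
    duration_s: float   # length of the recording in seconds


def hours_per_dialect(utterances):
    """Sum the recorded audio per dialect, in hours."""
    totals = defaultdict(float)
    for utt in utterances:
        totals[utt.dialect] += utt.duration_s / 3600.0
    return dict(totals)


def underrepresented(utterances, min_hours):
    """Return the dialects with less than min_hours of audio, sorted by name."""
    totals = hours_per_dialect(utterances)
    return sorted(d for d, h in totals.items() if h < min_hours)


# Toy data only; real utterances would be far shorter and far more numerous.
corpus = [
    Utterance("audio/0001.wav", "dialect_a", "Ich gehe heute nach Hause.", 5400.0),
    Utterance("audio/0002.wav", "dialect_a", "Das Wetter ist schön.", 1800.0),
    Utterance("audio/0003.wav", "dialect_b", "Wir treffen uns morgen.", 1800.0),
]

print(hours_per_dialect(corpus))
print(underrepresented(corpus, min_hours=1.0))
```

A coverage check of this kind would indicate which dialects lack training data and are therefore candidates for the data translation methods in step 3.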

The results of this project will pave the way for developing practical ST solutions for
Swiss German on a broad scale and help preserve Swiss German as
part of Switzerland’s cultural heritage in the digital age.