End-to-End Low-Resource Speech Translation for Swiss German Dialects
At a glance
- Project leader : Prof. Dr. Mark Cieliebak
- Co-project leader : Prof. Dr. Manfred Vogel
- Deputy of project leader : Dr. Jan Milan Deriu
- Project team : Michel Plüss, Dr. Tanja Samardzic
- Project budget : CHF 322'260
- Project status : ongoing
- Funding partner : SNSF (SNF-Projektförderung / Projekt Nr. 200729)
- Project partner : Fachhochschule Nordwestschweiz FHNW, Universität Zürich
- Contact person : Mark Cieliebak
This project investigates the application of recent findings in Speech Translation to Swiss German. Speech Translation (ST) is the task of translating spoken utterances in one language into written text in a different language. It serves as an essential tool for breaking down language barriers in various communication settings and is a promising means of preserving endangered languages.
We will investigate how ST can be applied to Swiss German dialects, i.e. how to translate Swiss German speech into Standard German text. This has numerous important real-life applications, such as voice bots like Siri or Alexa, interview transcription, generating meeting minutes, and evaluation of call-center dialogues. The rationale behind translating into Standard German is that it provides a unified written form and allows us to benefit from the abundance of NLP methods that exist for Standard German as a well-studied high-resource language (e.g. part-of-speech tagging, named entity recognition, sentiment analysis, and summarisation).
ST is an appealing approach for Swiss German, since it does not rely on an intermediate textual representation in the source language (note that there is no unified written form for Swiss German). However, it usually requires a large amount of annotated training data, which is not available for the numerous Swiss German dialects. For this reason, we will investigate how ST systems could be built for Swiss German dialects without the need to generate annotated data for each dialect. More precisely, we will:
- create a large-scale parallel corpus of 450 hours of audio in 7 major Swiss German dialects with corresponding translations into Standard German text. This corpus is generated in a fully controlled environment, allowing it to be used for scientifically sound experiments on Swiss German dialects.
- implement 3 ST systems and run experiments on how to optimally train an ST system for different dialects.
- investigate how to translate training data between different dialects of Swiss German, using Speech-to-Speech translation, Vocabulary Enhancement and Voice Adaptation, to mitigate the lack of training data for most Swiss German dialects.
- compile a set of recommendations and best practices on how to create general purpose ST systems for Swiss German dialects.
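As a rough illustration of the parallel corpus described above, the following minimal Python sketch shows one way an entry could be represented and how the amount of audio per dialect could be tallied. It is not the project's actual data format: the `Utterance` fields, the function names, and the `min_hours` threshold are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One hypothetical corpus entry: dialect audio paired with Standard German text."""
    audio_path: str    # location of the Swiss German recording (illustrative)
    dialect: str       # free-form dialect label (illustrative)
    duration_s: float  # length of the recording in seconds
    text_de: str       # Standard German translation of the utterance


def hours_per_dialect(utterances):
    """Sum the audio duration per dialect and return it in hours."""
    totals = defaultdict(float)
    for u in utterances:
        totals[u.dialect] += u.duration_s
    return {dialect: seconds / 3600.0 for dialect, seconds in totals.items()}


def low_resource_dialects(utterances, min_hours):
    """Return the dialects with less than min_hours of audio, sorted by name."""
    hours = hours_per_dialect(utterances)
    return sorted(d for d, h in hours.items() if h < min_hours)
```

Given a list of such entries, `hours_per_dialect(entries)` returns a dictionary mapping each dialect label to its total hours of audio, and `low_resource_dialects(entries, min_hours=...)` lists the dialects that fall below a chosen threshold, i.e. those for which translated or adapted training data from other dialects would be most relevant.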
The results of this project will pave the way for developing practical ST solutions for Swiss German on a broad scale and contribute to the prevalence of Swiss German as part of Switzerland’s cultural heritage in the digital age.
Plüss, Michel; Hürlimann, Manuela; Cuny, Marc; Stöckli, Alla; Kapotis, Nikolaos; Hartmann, Julia; Ulasik, Malgorzata Anna; Scheller, Christian; Schraner, Yanick; Jain, Amit; Deriu, Jan Milan; Cieliebak, Mark; Vogel, Manfred,
Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022).
13th Language Resources and Evaluation Conference (LREC), Marseille, France, 20-25 June 2022.
European Language Resources Association.
Available from: https://doi.org/10.21256/zhaw-26131