With public support: ZHAW and FHNW digitalise Swiss dialects
Both universities are calling for public support in collecting Swiss dialects in German-speaking Switzerland. Starting now, members of the public can help by recording speech samples with an app. The digitalised dialects can then be used to train speech recognition software.
Chatbots and voice assistants like Siri or Alexa find it difficult to understand spoken Swiss German. This is due to a lack of the audio recordings needed to train such systems. For technology companies like Google, Apple or Amazon, the Swiss market is too small to justify developing a solution. That is about to change. “We want to collect and digitalise Swiss dialects,” says Mark Cieliebak from the ZHAW Centre for Artificial Intelligence (CAI). “We want to collect at least 2000 hours of recorded Swiss-German dialects to have a good database.” Manfred Vogel of the FHNW adds: “We will use the collected data to train an AI-based algorithm to understand Swiss German and to convert it automatically into High German text.”
A web application has been developed for the project. Volunteers can use it to record their own audio samples by translating High German sentences into their dialect, to verify the recordings of other participants, or both. The app also shows from which cantons participants have already submitted audio files, and which cantons and dialects are still missing.
With the collected data and the trained models, voice interfaces for a range of applications can be developed. For example, it would become possible to speak Swiss German with a voice assistant, and companies could automatically evaluate customer feedback such as calls to customer service. Subtitles for TV shows could be generated automatically as well.
The technologies for training speech-to-text systems have advanced steadily in recent years and are now mostly based on neural networks. For languages such as English and German, these methods already deliver very good results, with error rates of less than two per cent.
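The text does not say how the error rate is measured. A common metric for speech-to-text systems is the word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch under that assumption; the function name and the example sentences are illustrative and are not taken from the project.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: plain whitespace tokenisation, with no normalisation of
    case or punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of five reference words is missing in the transcript.
    print(word_error_rate("das ist ein kleiner test", "das ist ein test"))
```

Under this definition, an error rate of less than two per cent corresponds to fewer than two word errors per 100 words of reference text.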
“Because we are publishing the data set for research purposes, computer programmes for different applications can be developed in cooperation with local companies,” explains Manuela Hürlimann of the Swiss Association for Natural Language Processing (SwissNLP), who coordinates and leads the project.
Prof. Dr. Mark Cieliebak, Head Natural Language Processing Group, Centre for Artificial Intelligence, ZHAW School of Engineering, Tel. 058 934 72 39, E-Mail email@example.com
Prof. Dr. Manfred Vogel, Head Information Processing, Institute for Data Science, FHNW School of Engineering, Tel. 056 202 77 36, E-Mail firstname.lastname@example.org
Manuela Hürlimann, Research Associate, Centre for Artificial Intelligence, ZHAW School of Engineering, Tel. 058 934 45 07, E-Mail email@example.com
Dr. Bettina Mack, Content-/ Media Manager and Science Writer, ZHAW Zurich University of Applied Sciences, Tel. +41 58 934 6026, E-Mail firstname.lastname@example.org