Swiss-AL: the multilingual Swiss Corpus for Applied Linguistics
Swiss-AL is a multilingual corpus of Swiss public communication that allows for the data-based analysis and simulation of social discourses.

The Swiss Corpus for Applied Linguistics (Swiss-AL) is a linguistically processed, multilingual collection of texts from key stakeholders in Swiss public communication. It supports data-based and data-driven research on social discourses in Switzerland. The corpus contains texts from Swiss daily and weekly newspapers and online news portals, as well as press releases, news and blog posts from stakeholders in the areas of politics, business, academia and civil society. A flexible processing pipeline makes it possible to build tailor-made sub-corpora for studying specific discourses.
Areas of application
Swiss-AL is used by the ZHAW’s Digital Discourse Lab as part of applied research projects on public discourses in Switzerland.
The media data contained in Swiss-AL also forms the basis for the selection of Switzerland’s Word of the Year in German, French, Italian and Romansh.
In addition, all of the degree programmes in the School of Applied Linguistics use the corpus for research-based teaching, for example when preparing theses at bachelor’s and master’s level.
Access
Selected sub-corpora of Swiss-AL can be explored with the purpose-built Swiss-AL workbench.
- Access for the public: https://swiss-al.linguistik.zhaw.ch
- Access for ZHAW staff and students: http://tools.linguistik.zhaw.ch
Composition
Swiss-AL-Base sub-corpus
- Contents: Swiss daily and weekly newspapers, online news portals and websites (press releases, news, blog posts) from the areas of politics, business, academia and civil society
- Period: 2010 to 2021
- Languages: German, French, Italian
- Volume (as at November 2021): German: 2.15 billion words, French: 1 billion words, Italian: 14 million words
Swiss-AL-Media sub-corpus
- Contents: Swiss daily and weekly newspapers, online news portals
- Period: 2016 to 2021
- Languages: German, French, Italian, Romansh
- Volume (as at November 2021): German: 1.2 billion words, French: 381 million words, Italian: 150 million words, Romansh: 5.9 million words
Swiss-AL-Covid19 sub-corpus
Thematic corpus on COVID-19 in Switzerland
- Contents: Swiss daily and weekly newspapers, online news portals and websites (press releases, news, blog posts) from the areas of politics, business, academia and civil society
- Period: 2020
- Languages: German, Italian
- Volume (as at November 2021): German: 23 million words, Italian: 6.7 million words
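The volumes listed above can be compared more easily once they are held in a small data structure. The following Python sketch is only an illustration: the class `SubCorpus`, the list `SUB_CORPORA` and the function `summarize` are hypothetical names and not part of the Swiss-AL workbench or its pipeline. The word counts are the rounded figures from this page (as at November 2021), written out as integers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SubCorpus:
    """Hypothetical container for the sub-corpus figures listed on this page."""

    name: str
    period: tuple[int, int]  # first and last year covered
    words: dict[str, int]    # language -> rounded word count

    def total_words(self) -> int:
        """Sum of the rounded word counts over all languages."""
        return sum(self.words.values())

    def share(self, language: str) -> float:
        """Fraction of the sub-corpus written in the given language."""
        return self.words.get(language, 0) / self.total_words()


# Figures copied from the composition lists above (rounded, November 2021).
SUB_CORPORA = [
    SubCorpus(
        "Swiss-AL-Base",
        (2010, 2021),
        {"German": 2_150_000_000, "French": 1_000_000_000, "Italian": 14_000_000},
    ),
    SubCorpus(
        "Swiss-AL-Media",
        (2016, 2021),
        {
            "German": 1_200_000_000,
            "French": 381_000_000,
            "Italian": 150_000_000,
            "Romansh": 5_900_000,
        },
    ),
    SubCorpus(
        "Swiss-AL-Covid19",
        (2020, 2020),
        {"German": 23_000_000, "Italian": 6_700_000},
    ),
]


def summarize(sub_corpora: list[SubCorpus]) -> dict[str, int]:
    """Map each sub-corpus name to its total rounded word count."""
    return {corpus.name: corpus.total_words() for corpus in sub_corpora}


if __name__ == "__main__":
    for corpus in SUB_CORPORA:
        german = corpus.share("German")
        print(f"{corpus.name}: {corpus.total_words():,} words, {german:.0%} German")
```

Because the published figures are rounded, the totals computed this way are approximations, not exact corpus sizes.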
Publications
- Krasselt, J., Fluor, M., Rothenhäusler, K., & Dreesen, P. (2021). A workbench for corpus linguistic discourse analysis. In D. Gromann, G. Sérasset, T. Declerck, J. P. McCrae, J. Gracia, J. Bosque-Gil, F. Bobillo, & B. Heinisch (Eds.), 3rd Conference on Language, Data and Knowledge (LDK 2021) (Vol. 93, pp. 26:1-26:9). Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/OASIcs.LDK.2021.26
- Krasselt, J., Dreesen, P., Fluor, M., Mahlow, C., Rothenhäusler, K., & Runte, M. (2020). Swiss-AL: A multilingual Swiss web corpus for applied linguistics. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 4138-4144). Marseille, France. https://aclanthology.org/2020.lrec-1.510