Key areas of research
“Methods for computer vision detect cancer cells or help blind people to navigate through town, chatbots answer customer questions 24 hours a day, and self-driving cars are set to radically change our transport systems. Our research strives for excellence through practical applicability.”
At the CAI, we focus on machine learning and deep learning methodology. In our experience, breakthroughs in one use case tend to translate well to other domains, as current AI methodology is largely sector-independent. We apply our expertise in the following areas:
Health and medicine, Industry 4.0, robotics, predictive maintenance, automated quality control, document analysis, and other data science use cases in industries including manufacturing, finance and insurance, retail, transportation, digital farming, weather forecasting, earth observation and many more.
- Reinforcement learning
- Multi-agent systems
- Embodied AI
In the field of autonomous learning systems research, we investigate the design and development of intelligent systems, specifically those that close a feedback loop between perception (processing of incoming sensor data) and action (execution of actions that influence the environment being perceived): the perception-action loop. An important methodology in this context is (deep) reinforcement learning, which allows agents to learn through trial and error. In the future, this reward-based type of learning will open up completely new areas of application in many industries, beyond the traditional learning from pairs of inputs and hand-engineered outputs, for example in industrial production or in the field of neurotechnology. Interconnecting such systems with hardware equipped with the required sensors and actuators creates additional training potential for the algorithms of autonomous systems through physical interaction (embodiment, e.g. in a robotic device).
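A minimal sketch of such a perception-action loop is given below, using tabular Q-learning as a simple trial-and-error learner; the Gymnasium environment, the hyperparameters and the number of episodes are illustrative assumptions, not project code.

```python
# Minimal perception-action loop with tabular Q-learning.
# Environment and hyperparameters are illustrative placeholders.
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=False)
q_table = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # Perception: the current observation; action: epsilon-greedy choice.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Trial-and-error update: move the Q-value towards the reward signal.
        q_table[state, action] += alpha * (
            reward + gamma * np.max(q_table[next_state]) - q_table[state, action]
        )
        state = next_state
```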
For the globally successful "Farming Simulator" video game series developed by GIANTS Software GmbH, a new, continually entertaining and easily extendable game mode has been made possible through artificial intelligence (AI). In this project, reinforcement learning algorithms are used to find suitable action strategies by simulating games.
While vision in living beings is an active process whereby image acquisition and classification are intertwined to gradually refine perception, much of today’s computer vision is built on the inferior paradigm of episodic classification of i.i.d. samples. Our aim is to enable improved scene understanding for robots by taking the sequential nature of seeing over time into account. We present a combined supervised and reinforcement learning multi-task approach to answer questions about different aspects of a scene, such as the relationships between objects, their quantity or their positions relative to the camera.
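As a generic sketch of how such a combined objective can look, the snippet below mixes a supervised cross-entropy loss with a REINFORCE-style policy gradient term in one multi-task loss; the function signature, weighting and baseline are illustrative assumptions, not the paper's actual model.

```python
# Generic sketch: combining a supervised loss with a REINFORCE-style term.
# Task heads, reward definition and loss weighting are illustrative assumptions.
import torch
import torch.nn.functional as F

def multi_task_loss(logits_supervised, targets, action_log_probs, rewards, rl_weight=0.5):
    # Supervised head: standard cross-entropy on labelled answers.
    ce = F.cross_entropy(logits_supervised, targets)
    # RL head: REINFORCE with a mean-reward baseline to reduce variance.
    baseline = rewards.mean()
    pg = -(action_log_probs * (rewards - baseline)).mean()
    return ce + rl_weight * pg
```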
- Pattern recognition
- Machine perception
- Neuromorphic engineering
The focus of the computer vision, perception and cognition area is on generating semantic understanding from high-dimensional input. This is achieved by learning and then finding essential patterns in a data-driven way using machine learning methodology and, specifically, deep neural networks (pattern recognition). Input sources include data from images, videos and other multimedia signals, but also multi-dimensional data series from any technical and non-technical field. Methodologically, classification, semantic segmentation and object detection play a role in analysing the input, while corresponding generative models (e.g. generative adversarial networks) are used for its synthesis. Biology-inspired ideas from the field of neuroscience are used to further develop the methodology (neuromorphic engineering).
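As a minimal illustration of data-driven pattern recognition on image input, the following sketch defines a small convolutional classifier; the architecture, input size and number of classes are placeholders chosen purely for illustration.

```python
# Minimal data-driven pattern recognition: a small CNN image classifier.
# Architecture, input size and class count are placeholders for illustration.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN()
logits = model(torch.randn(4, 3, 64, 64))  # a batch of 4 RGB images
```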
We have built a sheet music scanning service that ensures high-quality input. To increase market penetration, we plan to extend its use to smartphone images, used sheets and other sources. Project RealScore enhances the successful precursor project by enabling the deep learning models to adapt to previously unseen data through unsupervised learning.
Due to the methodology's vast success on a wide range of machine perception tasks, deep learning with neural networks is applied by an increasing number of actors outside classic research environments. While this interest is fuelled by exciting success stories, practical work in deep learning on novel tasks without existing baselines remains challenging. This paper explores specific challenges that arise in real-world tasks, based on case studies from research and development in conjunction with industry, and extracts the lessons learned.
- Trustworthy machine learning
- Robust deep learning
In the focus area of explainable AI, we investigate deep learning methods that meet the special requirements of professional (e.g. industrial or medical) practice. On the one hand, this means achieving results that are robust to slight variations in the input (e.g. due to creeping changes in the environment, known as covariate shift, or adversarial attacks) or robust despite small training sets (learning from little data, e.g. through better transfer learning and the use of self-supervised and unsupervised learning). On the other hand, it is important to make the learned models as well as the training process itself explainable (explainable AI). Firstly, this wins the trust of users or affected persons in the system (trustworthy AI) and can be a regulatory prerequisite for use in certain fields of application. Secondly, it serves the development process of the corresponding models and systems (ML debugging, MLOps).
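To make the "learning from little data" idea concrete, here is a minimal transfer learning sketch in which a backbone pre-trained on a large dataset is frozen and only a new final layer is trained on a small task-specific set; the choice of ResNet-18 from torchvision and the two-class head are illustrative assumptions.

```python
# Minimal transfer learning sketch: fine-tune only the final layer of a
# pre-trained backbone on a small dataset. Model choice is illustrative.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                     # freeze the pre-trained backbone
model.fc = nn.Linear(model.fc.in_features, 2)       # new head for a 2-class task

trainable = [p for p in model.parameters() if p.requires_grad]
# An optimiser over `trainable` then updates only the new classification head.
```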
QualitAI researches and develops a device for the automatic quality control of industrial products such as cardiac balloon catheters. This is facilitated through innovation in analysing camera images using deep learning, specifically in rendering the resulting model robust and interpretable.
The existence of adversarial attacks on convolutional neural networks (CNN) raises doubts about the fitness of such models for serious applications. Such attacks can manipulate input images so that misclassification is evoked while the images continue to look normal to the human observer—they are therefore not easily detectable. In a different context, backpropagated activations of CNN hidden layers—“feature responses” to a given input—have been helpful in visualising for a human “debugger” what the CNN “looks at” while computing its output. In this paper, we propose a novel detection method for adversarial examples to prevent attacks.
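To make the threat model concrete, the following sketch shows the well-known fast gradient sign method (FGSM) for crafting such an adversarial input; it does not reproduce the detection method proposed in the paper, and the model and perturbation budget are illustrative placeholders.

```python
# Minimal FGSM sketch: perturb an input along the sign of the loss gradient
# so that the image changes imperceptibly but the prediction may flip.
# The model and epsilon are illustrative placeholders.
import torch
import torch.nn.functional as F

def fgsm(model, image, label, epsilon=0.01):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```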
- Dialogue systems
- Text analytics
- Speech processing
The focus area of natural language processing investigates the machine understanding of human language in spoken (spoken language processing, automatic speech recognition, speaker diarization) and written form (natural language processing, text analytics), as well as the use of corresponding machine learning methods, such as transformers. This research is conducted in the context of dialogue systems and aims to enable natural language communication between humans and machines. Particular attention is paid to the development and availability of adapted methods and models for dialects and rare languages for which only limited training data is available. Further research areas include text classification (e.g. sentiment analysis), chatbots and natural language generation.
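As a minimal illustration of transformer-based text analytics, the sketch below runs sentiment analysis with the Hugging Face transformers pipeline; the default English checkpoint is merely illustrative, and dialects or rare languages would require adapted models as described above.

```python
# Minimal transformer-based sentiment analysis with Hugging Face transformers.
# The default checkpoint is illustrative; low-resource dialects would require
# adapted models and additional training data.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("The new dialogue system answered my question immediately."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```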
By developing child-like avatars for the training of interrogators of children, this project closes significant knowledge gaps regarding the effectiveness of individual training elements and personal influencing variables. The findings and the training tool can be used for education and training purposes as well as for the recruitment of personnel. The project’s findings can be used as a basis to improve interrogation practices and help to meet the international demand for a child-friendly justice system.
In this paper, we survey methods and concepts developed for the evaluation of dialogue systems. Evaluation, in and of itself, is a crucial part of the development process. Often, dialogue systems are assessed by means of human evaluation and questionnaires. However, these methods tend to be very expensive and time-consuming, which is why there have been intense efforts to find methods that reduce the involvement of human labour.
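As one example of an automatic, reference-based metric that reduces the need for human judges, the sketch below computes a BLEU score with NLTK; the example tokens are invented, and BLEU is shown only as an illustration since it correlates only loosely with human judgements of dialogue quality.

```python
# Automatic, reference-based evaluation example: BLEU via NLTK.
# Tokens are invented; BLEU serves only as an illustration of metrics
# that avoid expensive human evaluation.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["you", "can", "reset", "your", "password", "in", "settings"]]
candidate = ["reset", "your", "password", "in", "the", "settings", "menu"]
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.2f}")
```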
AC3T – AI powered CBCT for improved Combination Cancer Therapy
The project enables a novel adaptive cancer therapy that combines tumor treating fields and radiation therapy, made possible by significantly improved static (3D) and time-resolved (4D) low-dose cone beam computed tomography (CBCT) images based on artificial intelligence image reconstruction algorithms. ...
AUTODIDACT – Automated Video Data Annotation to Empower the ICU Cockpit Platform for Clinical Decision Support
Monitoring diverse sensor signals of patients in intensive care can be key to detecting potentially fatal emergencies. But to perform this monitoring automatically, the monitoring system has to know what is currently happening to the patient: if, for example, the patient is currently being moved by medical staff, ...
Good practices for responsible development of AI-based applications in healthcare
This project will identify proven methods, practices and standards that support responsible research and development of AI systems for health. They will be tested in use cases from medical imaging and neurotechnology, publicly released and published as a guideline of recommended best practices. ...
Pilot study: machine learning for injection molding processes
Researchers from the CAI and InES are conducting a joint technical deep dive to explore the possibilities of capturing process knowledge on injection molding in deep neural networks and transferring the results to novel usage scenarios. The groups of Prof. Stadelmann (Computer Vision, Perception & Cognition, ZHAW CAI) and ...
DOSSMA – Detection of Suspicious Social Media Activities
The DOSSMA project will investigate suspicious and malicious behaviour on social media platforms. In a first phase, we will compile an extensive survey report on the areas that are currently being researched, including the respective state-of-the-art, existing solutions and initiatives. This report will serve as a ...