Computer Vision, Perception and Cognition Group

“AI is THE key technology of the digital transformation, across sectors and industries, with major effects on our societies. Our research thus makes major contributions to the development of robust and trustworthy AI methods, and we enthusiastically teach their safe implementation and application.”
Fields of expertise

- Pattern recognition with deep learning
- Machine perception, computer vision and speaker recognition
- Neural system development
The CVPC group conducts pattern recognition research on a wide variety of tasks involving image, audio, and signal data in general. We focus on deep neural network and reinforcement learning methodology, inspired by biological learning. Each task we study has its own learning target (e.g., detection, classification, clustering, segmentation, novelty detection, control) and a corresponding use case (e.g., predictive maintenance, speaker recognition for multimedia indexing, document analysis, optical music recognition, computer vision for industrial quality control, automated machine learning, deep reinforcement learning for automated game play or building control). Each of these in turn sheds light on a different aspect of the learning process. We use this experience to create increasingly general AI systems built on neural architectures.
Services
- Insight: keynotes, trainings
- AI consultancy: workshops, expert support, advice, technology assessment
- Research and development: small to large-scale collaborative projects, third party-funded research, student projects, commercially applicable prototypes
Team
Head of Research Group
Projects
- AC3T – AI powered CBCT for improved Combination Cancer Therapy
  The project enables a novel, adaptive cancer therapy that combines tumor treating fields and radiation therapy. It is made possible by significantly improved static (3D) and time-resolved (4D) low-dose cone beam computed tomography (CBCT) images based on artificial intelligence image reconstruction algorithms. ...
- AUTODIDACT – Automated Video Data Annotation to Empower the ICU Cockpit Platform for Clinical Decision Support
  Monitoring diverse sensor signals of patients in intensive care can be key to detecting potentially fatal emergencies. But to perform the monitoring automatically, the monitoring system has to know what is currently happening to the patient: if the patient is, for example, currently being moved by medical staff, ...
- Good practices for responsible development of AI-based applications in healthcare
  This project will identify proven methods, practices and standards that support responsible research and development of AI systems for health. They will be tested in use cases from medical imaging and neurotechnology, publicly released and published as a guideline of recommended best practices. ...
- Pilot study machine learning for injection molding processes
  Researchers from the CAI and InES conduct a joint technical deep dive to explore the possibilities of capturing process knowledge on injection molding in deep neural networks and of transferring the results to novel usage scenarios. The groups of Prof. Stadelmann (Computer Vision, Perception & Cognition, ZHAW CAI) and ...
- Accessible Scientific PDFs for All
  PDF is the most popular document format for providing and distributing information on the internet. It was developed by Adobe in 1993 but has been an open format since 2008. It was estimated in 2015 that more than 2.5 trillion PDF documents exist on the internet, covering all aspects of life and research, and their number ...
Publications
- Elezi, Ismail; Tuggener, Lukas; Pelillo, Marcello; Stadelmann, Thilo, 2018. DeepScores and Deep Watershed Detection : current state and open issues [paper]. In: Proceedings of the 1st International Workshop on Reading Music Systems. 1st International Workshop on Reading Music Systems at ISMIR 2018, Paris, France, 20 September 2018. Paris: Society for Music Information Retrieval. pp. 13-14. Available from: https://doi.org/10.21256/zhaw-4777
- Stadelmann, Thilo; Glinski-Haefeli, Sebastian; Gerber, Patrick; Dürr, Oliver, 2018. Capturing suprasegmental features of a voice with RNNs for improved speaker clustering [paper]. In: Proceedings of the 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR). 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR), Siena, Italy, 19-21 September 2018. IAPR. pp. 333-345. Lecture Notes in Computer Science ; 11081. Available from: https://doi.org/10.1007/978-3-319-99978-4_26
- Stadelmann, Thilo; Amirian, Mohammadreza; Arabaci, Ismail; Arnold, Marek; Duivesteijn, Gilbert François; Elezi, Ismail; Geiger, Melanie; Lörwald, Stefan; Meier, Benjamin Bruno; Rombach, Katharina; Tuggener, Lukas, 2018. Deep learning in the wild [paper]. In: Proceedings of the 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR). 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR), Siena, Italy, 19-21 September 2018. IAPR. Available from: https://doi.org/10.21256/zhaw-3872
- Tuggener, Lukas; Elezi, Ismail; Schmidhuber, Jürgen; Stadelmann, Thilo, 2018. Deep watershed detector for music object recognition [paper]. In: Proceedings of the 19th International Society for Music Information Retrieval Conference. 19th International Society for Music Information Retrieval Conference, Paris, 23-27 September 2018. Paris: Society for Music Information Retrieval. Available from: https://doi.org/10.21256/zhaw-3760
- Tuggener, Lukas; Elezi, Ismail; Schmidhuber, Jürgen; Pelillo, Marcello; Stadelmann, Thilo, 2018. DeepScores : a dataset for segmentation, detection and classification of tiny objects [paper]. In: Proceedings of the 24th International Conference on Pattern Recognition. 24th International Conference on Pattern Recognition (ICPR 2018), Beijing, China, 20-28 August 2018. Beijing: IAPR. pp. 1-6. Available from: https://doi.org/10.21256/zhaw-4255