Machine Learning in Visual Computing | ZHAW Institute of Data Analysis and Process Design IDP

Computer Vision and Machine Learning are at the heart of the digital revolution, transforming how we interact with technology and the world around us. We are dedicated to advancing these dynamic fields through comprehensive education, research, and practical applications.

Teaching Machines how to See

Our research initiatives push the boundaries of what's possible. Collaborate with esteemed faculty on projects in automated image interpretation, machine learning, visual communication, and more. Our work doesn't just stay in the lab – it has real-world implications, influencing everything from healthcare diagnostics to art.

Teaching Students how to Teach Machines how to See

Together with the Institute of Computer Science, we offer basic and advanced courses on Visual Computing.

Visual Interestingness

DALL·E 3: “Give me an image which shows the contrast between interesting and uninteresting [...]”. In the generated image, the interesting part uses many colors, is usually crowded with objects, and is generally quite complex. The uninteresting part, by contrast, is usually depicted with few objects, monotonous and colorless, tending towards simplicity. Would you agree? Does this hold in general?

Interestingness -- the power of attracting or holding one's attention.

Our daily life is greatly influenced by what we consume and see. On one hand, we decide based on our personal interests which news, movies, or other events we focus our attention on. On the other hand, most people are also very open to external visual stimuli that could influence their behavior. To learn more about human visual perception and its effects on judging events as interesting, but also for commercial purposes, it is of great interest to understand what triggers human attention and interest.

For instance, models of what people consider 'interesting' could be used to automatically analyze video streams in video surveillance applications and alert users. Such models can also help people in their work by automatically highlighting 'interesting' facts that might otherwise be overlooked. This is especially important in time-critical scenarios where someone needs to quickly get an overview of many facts, such as in medical emergency care.

ZHAW Media Linguistics
T. Koller, and H. Grabner, Who wants to be a click-millionaire? On the influence of thumbnails and captions. In Proc. IEEE International Conference on Pattern Recognition (ICPR), 2022
F. Bünzli, W. Weber, F. Abdullahu, and H. Grabner, Do Vectors of Motion Make Advertisements More Interesting? Annual Conference of the International Communication Association (ICA), 2023, to appear

Watching the World

Right now, thousands of publicly accessible network cameras are capturing images of the world. Recently, we personally visited four of these cameras. It was an exciting, inspiring, and highly creative 'traveling' team retreat across Switzerland, part of the art-meets-science project: Watching the World.

All images are equal but some images are more equal than others.

"WATCHING THE WORLD, The Encyclopedia Of the Now" is an art, photography, exhibition, AI, big data, and online project that exclusively uses open data sources. It photographs the world simultaneously in live mode around the clock and across the globe using publicly accessible webcams, presenting these recordings in real-time on the website in various modes and developing a new way of seeing, a new form of photography with the help of AI.

On the webpage https://webcamaze.engineering.zhaw.ch, more than 10,000 webcams are analyzed in real time. If the images were printed out, they would stack up to the height of the Great Pyramid of Giza – every day! Without methods of machine learning and automatic image processing, such volumes could no longer be managed.

Most of the time, there is nothing "interesting" to see – but if you are at the right place at the right time, you see surprising, unexpected, bizarre, perhaps questionable images that invite reflection and discussion.

"SPY BOT 2000 - We believe the version of the arty bollocks write up is that this loads live webcam images from around the world and gets an AI to sort them into categories. Actually quite an interesting use of AI for a change, if a little creepy." -- b3ta.com

Surgical Proficiency

We create an AR-based simulator for open orthopedic surgery training, specifically focusing on Total Hip Arthroplasty. This simulator represents a significant advancement over conventional cadaver-based training as it integrates adaptive AR guidance and AI techniques. By harnessing the power of intuitive AR instructions and leveraging AI-driven analysis of human activity and behavior, our aim is to optimize the usability of surgical training simulators.

It initiates the paradigm shift in surgical training for optimally prepared surgeons and even greater patient safety.

Training on patients according to the principle of “see one, do one, teach one” no longer meets today’s requirements, nor does it reflect the technical opportunities now available for the education of surgical residents. During the Covid-19 pandemic, most hands-on training had to be discontinued, leading to an almost complete interruption of surgical education. Under the lead of the three clinical partners Kantonsspital St.Gallen, Centre Hospitalier Universitaire Vaudois, and Balgrist University Hospital, and endorsed by the Swiss Surgical Societies, novel standardized and proficiency-based surgical training curricula are defined and interfaced to simulation tools. The four implementation partners VirtaMed AG, Microsoft Mixed Reality & AI Lab Zurich, OramaVR SA, and Atracsys LLC, in collaboration with ETHZ and ZHAW, develop these innovative training tools, ranging from online virtual reality simulation, augmented box trainers, and high-end simulators to augmented-reality-enabled open surgery and immersive remote operating room participation. The proposed developments introduce a fully novel, integrative training paradigm, installed and demonstrated on two example surgical modalities, laparoscopy and arthroscopy, while remaining fully generalizable to other interventions. This will set new standards both in Switzerland and abroad.

Target Recognition using Artificial Intelligence

X-ray images at airports are either checked manually or analyzed using object detectors. Pistols, pistol parts, ammunition, and knives are recognized. Commercial systems are closed source and operate opaquely to the end user. Manufacturers of AI systems advertise high hit rates and low false alarm rates. However, these numbers cannot be interpreted meaningfully unless it is known which methods and data were used to evaluate the systems.
An independent and scientifically based evaluation of the systems is therefore important. This project is a collaboration between Casra, FHNW, and the ZHAW. The FHNW examines the project from a psychological point of view; in particular, human-machine interaction is analyzed. Casra has the project lead and creates X-ray images for training and testing. Our part is to create, train, and test different kinds of state-of-the-art object detectors. The goal is to identify the strengths and weaknesses of the systems as a whole. The results are intended to feed into a proposal for the certification of commercial systems. One open question: is it possible to certify object detectors separately from X-ray machines?
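
Why advertised rates depend on the evaluation data can be illustrated with a small sketch. It computes the hit rate and the false alarm rate from outcome counts in the usual way (hits divided by all threat images, false alarms divided by all harmless images). The `EvaluationCounts` structure, the function names, and all counts are invented for illustration; they are not results from this project or from any commercial system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationCounts:
    """Outcome counts for one detector on one test set (hypothetical structure)."""
    true_positives: int   # threat images correctly flagged
    false_negatives: int  # threat images missed
    false_positives: int  # harmless images wrongly flagged
    true_negatives: int   # harmless images correctly passed


def hit_rate(counts: EvaluationCounts) -> float:
    """Fraction of threat images that were flagged."""
    threats = counts.true_positives + counts.false_negatives
    if threats == 0:
        raise ValueError("test set contains no threat images")
    return counts.true_positives / threats


def false_alarm_rate(counts: EvaluationCounts) -> float:
    """Fraction of harmless images that were flagged anyway."""
    harmless = counts.false_positives + counts.true_negatives
    if harmless == 0:
        raise ValueError("test set contains no harmless images")
    return counts.false_positives / harmless


# Invented counts: the same imagined detector on two different test sets.
easy_test_set = EvaluationCounts(true_positives=95, false_negatives=5,
                                 false_positives=2, true_negatives=98)
hard_test_set = EvaluationCounts(true_positives=70, false_negatives=30,
                                 false_positives=10, true_negatives=90)

for name, counts in (("easy", easy_test_set), ("hard", hard_test_set)):
    print(f"{name}: hit rate {hit_rate(counts):.2f}, "
          f"false alarm rate {false_alarm_rate(counts):.2f}")
```

With these made-up numbers, one and the same detector could be advertised with a 95% or a 70% hit rate, depending only on which test images were chosen.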

Detection of Yellow Nutsedge with Drones and Algorithms

Digital agriculture holds much promise: intelligent drones are supposed to detect weeds, pests, or nutrient deficiencies and translate the collected information into application maps, thus reducing or even replacing the time-consuming task of walking through fields.

Yellow nutsedge (Erdmandelgras, Cyperus esculentus) is an invasive neophyte that causes great damage to agriculture in Switzerland. The earlier yellow nutsedge is detected, the easier it is to remediate affected fields. There is currently no organized monitoring; the weed is typically discovered by chance. The aim of this project is to use drone images and computer vision methods to detect yellow nutsedge from the air. The project partners are the Institute of Natural Resource Sciences (Geoinformatics Research Group) and Agroscope. The aerial images of various infested fields with different crops are enriched with geocoordinates and manually annotated by experts. The annotated images are used to train a state-of-the-art deep neural network, enabling it to identify and pinpoint yellow nutsedge in new, unseen images, which are then mapped onto an overview orthophoto.
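
How a detection might be placed on a georeferenced orthophoto can be sketched as follows. This is a minimal illustration, not the project's pipeline: it assumes the detection is a bounding box in orthophoto pixel coordinates and that the orthophoto carries a six-coefficient affine geotransform in the ordering convention used by GDAL. The field names, the helper functions, and all coordinate values are hypothetical.

```python
from typing import NamedTuple, Tuple


class GeoTransform(NamedTuple):
    """Affine pixel-to-map coefficients (GDAL ordering; field names are hypothetical)."""
    x_origin: float         # map x of the top-left corner of the top-left pixel
    pixel_width: float      # map units per pixel column
    row_rotation: float     # x shift per pixel row (0 for north-up images)
    y_origin: float         # map y of the top-left corner of the top-left pixel
    column_rotation: float  # y shift per pixel column (0 for north-up images)
    pixel_height: float     # map units per pixel row (negative for north-up images)


class Box(NamedTuple):
    """Detected bounding box in pixel coordinates."""
    col_min: float
    row_min: float
    col_max: float
    row_max: float


def pixel_to_map(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Apply the affine transform to one pixel position."""
    x = gt.x_origin + col * gt.pixel_width + row * gt.row_rotation
    y = gt.y_origin + col * gt.column_rotation + row * gt.pixel_height
    return x, y


def box_center_on_map(gt: GeoTransform, box: Box) -> Tuple[float, float]:
    """Map coordinates of the center of a detected box."""
    center_col = (box.col_min + box.col_max) / 2
    center_row = (box.row_min + box.row_max) / 2
    return pixel_to_map(gt, center_col, center_row)


# Made-up example: a north-up orthophoto with 2 cm pixels and an arbitrary origin.
transform = GeoTransform(2680000.0, 0.02, 0.0, 1250000.0, 0.0, -0.02)
detection = Box(col_min=1400, row_min=2400, col_max=1600, row_max=2600)

x, y = box_center_on_map(transform, detection)
print(f"detection center at x={x:.2f}, y={y:.2f}")
```

In this example the box center lies 1500 columns and 2500 rows from the origin, which corresponds to roughly 30 m east and 50 m south of the orthophoto's top-left corner.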