School of Engineering

Visual Intelligence and Applications

As visual data becomes one of the most abundant and complex sources of information, Visual Intelligence is a key pillar of modern data science — enabling new ways to analyze, model, and communicate through images, video, and immersive environments.

Teaching Machines how to See — and Helping Humans Understand

To become a true professional, one must develop expertise through a combination of core competences (knowledge), practical proficiency, and real-world experience.

  • Our approach to human skill development emphasizes collaborative and communicative learning, supported by interactive and immersive technologies. We provide training opportunities that bridge both the physical and virtual worlds, ensuring deep and lasting learning outcomes.
  • Our research initiatives push the boundaries of what's possible. Collaborate with esteemed faculty on projects in automated image interpretation, machine learning, visual communication, and more.
  • Our work doesn't just stay in the lab — it has real-world impact, influencing everything from healthcare diagnostics to the arts.

Research & Projects

Visual Interestingness

Interestingness -- the power of attracting or holding one's attention.

Our daily life is greatly influenced by what we consume and see. On the one hand, our personal interests determine which news, movies, or other events we pay attention to. On the other hand, most people are also receptive to external visual stimuli that can influence their behavior. Understanding what triggers human attention and interest is therefore valuable both for research on human visual perception, in particular how people come to judge events as interesting, and for commercial purposes.

For instance, models of what people consider 'interesting' could be used to automatically analyze video streams in surveillance applications and alert users. They could also help people in their work by automatically highlighting 'interesting' facts that might otherwise be overlooked. This is especially important in time-critical scenarios where someone needs to get a quick overview of many facts, such as in emergency medical care.

The Effect of Implied Motion

Over the past several decades, visual imagery has become the dominant element in modern advertising. A common content strategy involves depicting humans, animals, or objects in the midst of motion. Whereas previous research indicates that implied motion images enhance persuasion, it is unclear whether this effect is unique to depictions of moving humans or if it also applies to depictions of moving animals (e.g., a dolphin jumping out of the water) and moving objects (e.g., a car driving on a street, a burger being tossed in the air). Across a set of seven experimental studies, we provide robust evidence that images depicting animate and inanimate motion increase the persuasiveness of an advertisement and that this effect occurs through enhanced engagement. Our findings further indicate that the level of engagement is influenced by the complexity of the depicted motion, with more complex, nonlinear movements eliciting greater engagement than simpler, linear movements. Overall, this research contributes to the advertising literature by providing an empirically grounded account of implied motion imagery and by helping marketers create more effective advertising.

Watching the World

All images are equal but some images are more equal than others.

"WATCHING THE WORLD, The Encyclopedia Of the Now" is an art, photography, exhibition, AI, big data, and online project that uses open data sources exclusively. It photographs the world live, around the clock and across the globe, through publicly accessible webcams. The recordings are presented in real time on the website in various modes, and, with the help of AI, the project develops a new way of seeing and a new form of photography.

On the webpage https://webcamaze.engineering.zhaw.ch, more than 10,000 webcams are analyzed in real time. If the images were printed out, they would stack up to the height of the Great Pyramid of Giza – every day! Without methods of machine learning and automatic image processing, such volumes could no longer be managed.
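As a rough plausibility check of the stack comparison, the sketch below computes the height of a stack with one printed sheet per image. The figure of 1,000,000 images per day comes from the student project description further down this page. The sheet thickness and the pyramid height are assumptions made for illustration, not values published by the project.

```python
def stack_height_m(images_per_day: int, sheet_thickness_mm: float) -> float:
    """Height in metres of a stack with one printed sheet per image."""
    return images_per_day * sheet_thickness_mm / 1000.0


IMAGES_PER_DAY = 1_000_000   # "over 1 million images per day" (project description)
PYRAMID_HEIGHT_M = 138.5     # assumption: approximate present-day height
SHEET_THICKNESS_MM = 0.14    # assumption: somewhat heavier printing paper

height = stack_height_m(IMAGES_PER_DAY, SHEET_THICKNESS_MM)
ratio = height / PYRAMID_HEIGHT_M
print(f"{height:.0f} m per day, {ratio:.2f} pyramid heights")
```

With ordinary office paper of roughly 0.1 mm per sheet, the same calculation gives about 100 m per day, so the comparison holds in order of magnitude under either assumption.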

Most of the time, there is nothing "interesting" to see – but if you are at the right place at the right time, you see surprising, unexpected, bizarre, perhaps questionable images that invite reflection and discussion.

"SPY BOT 2000 - We believe the version of the arty bollocks write up is that this loads live webcam images from around the world and gets an AI to sort them into categories. Actually quite an interesting use of AI for a change, if a little creepy." -- b3ta.com

1 camera, 1 bird, 1 image

AWARD: SNSF Scientific Image Competition 2025 - Jury distinction

Jury’s commentary | Frozen in flight and defying gravity, this bird watching us offers a tongue-in-cheek reflection on the observer being observed. The partly humorous, partly absurd shot addresses the role of artificial intelligence and the deluge of images produced by the proliferating webcams and security cameras around the world, questioning the modern practice of photography.

IMMERSE: Immersive Education and Science Exploration

Extended reality (XR) learning environments are developed to spark curiosity, encourage exploration, and inspire creativity in learners. At the same time, educators are empowered by these tools to present their topics in an immersive and interactive way.

Close collaboration is maintained with the University of Zurich (UZH) and the Zurich University of Teacher Education (PH Zurich). Additional support is provided by practice partners such as Microsoft, Siemens, and Magic Leap, whose specialized expertise in XR technology, 3D data acquisition, and cloud solutions plays a key role. Through their contributions, ideas are transformed from early design and prototyping stages into fully realized applications. Applications are created across a variety of educational fields. 

To support learners and educators in their respective environments, applications are developed with cross-platform use in mind. The aim is a common XR platform that allows projects to be presented to individuals using VR or AR headsets, to groups via smartphones, or to entire classrooms via 3D projectors. These tools allow subjects to be explored from fresh and exciting perspectives.

Surgical Proficiency

The project initiates a paradigm shift in surgical training, aiming for optimally prepared surgeons and even greater patient safety.

Training on patients according to the principle of “see one, do one and teach one” no longer meets today’s requirements or reflects the technical opportunities available for educating surgical residents. During the Covid-19 pandemic, most hands-on training had to be discontinued, leading to an almost complete interruption of surgical education.

Under the lead of the three clinical partners Kantonsspital St.Gallen, Centre Hospitalier Universitaire Vaudois, and Balgrist University Hospital, and endorsed by the Swiss Surgical Societies, novel standardized and proficiency-based surgical training curricula are defined and interfaced to simulation tools. The four implementation partners VirtaMed AG, Microsoft Mixed Reality & AI Lab Zurich, OramaVR SA, and Atracsys LLC, in collaboration with ETHZ and ZHAW, develop these innovative training tools, ranging from online virtual reality simulation, augmented box trainers, and high-end simulators to augmented-reality-enabled open surgery and immersive remote operating room participation. The proposed developments introduce a fully novel, integrative training paradigm that is installed and demonstrated on two example surgical modalities, laparoscopy and arthroscopy, and is fully generalizable to other interventions. This will set new standards both in Switzerland and abroad.

  • R. Lekar, T. Gerth, S. Prokudin, M. Seibold, R. Bürgin, B. Vella, A. Hoch, S. Tang, P. Fürnstahl, and H. Grabner. Enhancing Orthopedic Surgical Training With Interactive Photorealistic 3D Visualization. Annual Meeting of the International Society for Computer Assisted Orthopaedic Surgery (CAOS), 2025
  • L. Wu, M. Seibold, N. Cavalcanti, J. Hein, T. Gerth, R. Lekar, A. Hoch, L. Vlachopoulos, H. Grabner, P. Zingg, M. Farshad, and P. Fürnstahl. A novel augmented reality-based simulator for enhancing orthopedic surgical training. Computers in Biology and Medicine, Volume 185, 2025
  • https://www.surgicalproficiency.ch
  • ZHAW Impact
  • OR-X, ROCS Balgrist
  • ETH Computer Vision and Learning Group

Teaching

“We are Teaching Students how to Teach Machines how to See.”

Lectures

Together with the Institute of Computer Science we offer basic and advanced courses on Visual Computing.

Student Project Offers (BSc & MSc)

A selection of offered student projects is listed below. BSc projects and bachelor theses can also be found at Complesis. We are happy to discuss your own ideas as well!

Interactive Game Wall – Tic Tac Toe with Computer Vision
Contact: martin.frey@zhaw.ch 

This project explores a novel interactive game wall where Tic Tac Toe is played by kicking a football at projection surfaces. Using only cameras, a projector, and computer vision — no extra sensors — the system detects ball impacts and displays X or O at the hit location, enabling intuitive, digital play.
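The game logic behind such a wall can be sketched independently of the vision part. The example below assumes that impact detection already yields a hit position in projection coordinates. The function names, the 3x3 grid layout, and the 1920x1080 resolution are hypothetical choices for illustration, not the project's implementation.

```python
from typing import List, Optional, Tuple

Board = List[List[Optional[str]]]


def hit_to_cell(x: float, y: float, width: float, height: float) -> Tuple[int, int]:
    """Map an impact point in projection coordinates to a (row, col) grid cell."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("impact outside the projection surface")
    # Clamp to 2 to guard against floating-point rounding at the far edge.
    col = min(2, int(3 * x / width))
    row = min(2, int(3 * y / height))
    return row, col


def winner(board: Board) -> Optional[str]:
    """Return 'X' or 'O' if a player owns a full row, column, or diagonal."""
    lines = [list(row) for row in board]
    lines += [[board[r][c] for r in range(3)] for c in range(3)]
    lines.append([board[i][i] for i in range(3)])
    lines.append([board[i][2 - i] for i in range(3)])
    for line in lines:
        if line[0] is not None and line[0] == line[1] == line[2]:
            return line[0]
    return None


# Three hypothetical impacts by player X along the main diagonal.
board: Board = [[None] * 3 for _ in range(3)]
for hit_x, hit_y in [(100, 100), (960, 540), (1800, 1000)]:
    r, c = hit_to_cell(hit_x, hit_y, width=1920, height=1080)
    board[r][c] = "X"
print(winner(board))
```

Keeping the mapping and the win check separate from detection means the same logic can be tested without a camera or projector.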

Watching the World – Finding the Needle in a Haystack
Contact: fitim.abdullahu@zhaw.ch

This project explores the world through more than 10,000 publicly available webcams. In a flood of over 1 million images per day, most scenes are ordinary. Using AI and computer vision, we aim to detect rare and interesting moments as they happen — finding the "needle in the haystack" in real time.

Nail It! – Traditional Game Reimagined in AR
Contact: tatiana.gerth@zhaw.ch

This project brings the traditional “Nageln” game into augmented reality using markerless AR and motion sensors. With just a smartphone, players take turns hammering virtual nails into a trunk: no controllers, no accounts, no setup. Pass the phone around for a fun, intuitive group experience.

Window to the World
Contact: matthias.karst@zhaw.ch

Build an installation that functions as a virtual window to anywhere in the world, using a screen, a face-tracking camera, and AI-based image segmentation. As you move your head in front of the window, the displayed image must move accordingly, just as the view through a real window would.
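One way to reason about the window effect is simple ray geometry. The sketch below assumes that a face tracker supplies the head position in metres relative to the window centre and that the scene is a flat image at a fixed distance behind the window. The function name and all parameters are hypothetical. It computes which part of the scene plane is visible through the window frame.

```python
from typing import Tuple


def visible_region(
    head_x: float, head_y: float, head_z: float,
    win_w: float, win_h: float, scene_dist: float,
) -> Tuple[float, float, float, float]:
    """Return (left, right, bottom, top) of the scene area seen through the window.

    The window is centred at the origin. head_z is the viewer's distance in
    front of it, scene_dist the distance of the scene plane behind it.
    """
    if head_z <= 0 or scene_dist < 0:
        raise ValueError("head must be in front of the window, scene behind it")
    k = scene_dist / head_z  # similar-triangles scale factor

    def project(edge: float, head: float) -> float:
        # Extend the ray from the head through a window edge to the scene plane.
        return edge + (edge - head) * k

    left = project(-win_w / 2, head_x)
    right = project(win_w / 2, head_x)
    bottom = project(-win_h / 2, head_y)
    top = project(win_h / 2, head_y)
    return left, right, bottom, top


centred = visible_region(0.0, 0.0, 1.0, win_w=1.0, win_h=0.5, scene_dist=2.0)
moved = visible_region(0.25, 0.0, 1.0, win_w=1.0, win_h=0.5, scene_dist=2.0)
print(centred, moved)
```

Moving the head to the right shifts the visible region to the left, as with a real window, and moving closer widens it. The returned rectangle can then be used to crop the source image for display.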

Finding the Suspect 
Contact: matthias.karst@zhaw.ch

Build a VR game where you search for suspects all around the world, using thousands of webcams. Select an object in one image and receive dozens of new ones showing an object with similar shape, color or type. Starting from a random image, you will have to make smart connections to find the suspect.
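The "similar object" lookup can be illustrated with a small nearest-neighbour search. The sketch below assumes that each detected object has already been turned into a numeric feature vector, for example by an image embedding model. The three-dimensional toy vectors, the camera names, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float], gallery: Dict[str, Sequence[float]], k: int = 3
) -> List[str]:
    """Return the names of the k gallery entries most similar to the query."""
    ranked = sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy gallery: one hypothetical feature vector per webcam image.
gallery = {
    "cam_a": [0.9, 0.1, 0.0],
    "cam_b": [0.0, 1.0, 0.0],
    "cam_c": [0.7, 0.7, 0.0],
}
matches = most_similar([1.0, 0.0, 0.0], gallery, k=2)
print(matches)
```

For thousands of webcams, a linear scan like this would likely be replaced by an approximate nearest-neighbour index, but the ranking idea stays the same.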

Movement Classification of Functional Fitness Videos
Contact: basil.achermann@zhaw.ch

In Functional Fitness Competitions, athletes qualify by submitting workout videos judged for movement standards. This process presents two opportunities: automating evaluations with computer vision for efficiency and objectivity, and leveraging the footage for sports science research.

Movement Classification using Mobile Measurements
Contact: basil.achermann@zhaw.ch

We're using IMU data for Human Activity Recognition, focusing on strength and sport-specific movements to analyze performance, technique, and fatigue. What movements are you interested in tracking? From fine motor skills to full-body actions, we welcome your ideas and collaboration.
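A common first step in IMU-based activity recognition is to cut the signal into overlapping windows and compute simple features per window. The sketch below assumes three-axis accelerometer samples and uses the mean and standard deviation of the acceleration magnitude as features. The window length, the step size, and the feature choice are illustrative assumptions, not the project's pipeline.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (ax, ay, az) accelerometer reading


def magnitude(sample: Sample) -> float:
    """Euclidean norm of one accelerometer sample."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def window_features(
    samples: Sequence[Sample], window: int, step: int
) -> List[Tuple[float, float]]:
    """Return (mean, std) of the acceleration magnitude for each sliding window."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    mags = [magnitude(s) for s in samples]
    features = []
    for start in range(0, len(mags) - window + 1, step):
        chunk = mags[start:start + window]
        features.append((mean(chunk), pstdev(chunk)))
    return features


# Synthetic signal: four quiet samples followed by four stronger ones.
signal: List[Sample] = [(0.0, 0.0, 1.0)] * 4 + [(0.0, 0.0, 3.0)] * 4
features = window_features(signal, window=4, step=2)
print(features)
```

The resulting feature rows could then be fed to any classifier. The middle window, which straddles the change in intensity, is the one with a non-zero standard deviation.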

Team

“Great things in business are never done by one person, they are done by a team of people” ― Steve Jobs.

Former Members