New paper introduces a method to robustly homogenize CT image data for COVID-19 detection
The PrepNet neural network architecture homogenizes the appearance of CT images from different source datasets and hospitals so that more robust and trustworthy classifiers can be trained based on larger datasets.
CAI researchers Mohammadreza Amirian and Dr. Javier Montoya, with their teams, developed a new method to overcome the problem of data diversity in medical imaging. The newly proposed method, called “PrepNet”, takes as input images acquired under different conditions, e.g. with different scanner technologies, together with an identifier of these conditions, e.g. the source domain. PrepNet is trained to transform each input image into a pre-processed output image with the same medical content, but from which a classifier can no longer identify the source domain. In other words, PrepNet homogenizes the appearance of images acquired under changing conditions while keeping their semantic meaning intact. The method has been evaluated for diagnosing COVID-19 from chest CT images and shows potential for future applications. At the same time, the study is a renewed reminder of data quality issues in machine learning research: publicly available COVID-19 datasets can be of low quality, so that the developed methods cannot yet reach their full potential.
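For readers who want a feel for the idea, the training goal described above can be sketched as two competing terms: one that keeps the output close to the input, and one that rewards outputs whose source domain a classifier cannot guess. The following is only an illustrative sketch under our own assumptions, not the loss used in the paper: the function names, the mean-squared-error term, the uniform-target confusion term and the `weight` parameter are all hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def reconstruction_loss(inputs, outputs):
    # Assumed content term: mean squared error between input and output images.
    return float(np.mean((inputs - outputs) ** 2))

def domain_classifier_loss(domain_logits, domain_ids):
    # Standard cross-entropy for a (hypothetical) domain classifier that
    # tries to recover the source-domain identifier from the output image.
    probs = softmax(domain_logits)
    picked = probs[np.arange(len(domain_ids)), domain_ids]
    return float(-np.mean(np.log(picked + 1e-12)))

def domain_confusion_loss(domain_logits):
    # Assumed homogenization term: cross-entropy between the classifier's
    # prediction and a uniform distribution over domains. It is smallest
    # when the classifier cannot tell the domains apart.
    probs = softmax(domain_logits)
    num_domains = domain_logits.shape[-1]
    return float(-np.mean(np.sum(np.log(probs + 1e-12) / num_domains, axis=-1)))

def homogenization_objective(inputs, outputs, domain_logits, weight=1.0):
    # Combined objective for the image-transforming network (illustrative only).
    return reconstruction_loss(inputs, outputs) + weight * domain_confusion_loss(domain_logits)
```

In such a setup, the domain classifier would be trained with `domain_classifier_loss` on the known source-domain identifiers, while the image-transforming network would be trained to lower `homogenization_objective`, so that its outputs keep their content but give the classifier nothing to work with.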
The research was originally funded through two projects by the Digital Futures Fund of ZHAW digital and subsequently finalized in a Bachelor thesis by Jonathan Gruss and Yves D. Stebler, recent ZHAW computer science alumni. The results were recently presented at the 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics. Read the full article here: “PrepNet: a convolutional auto-encoder to homogenize CT scans for cross-dataset medical image analysis”.