Imputation of rounded zeros for high-dimensional compositional data

Imputation of rounded zeros for high-dimensional compositional data. Chemometrics and Intelligent Laboratory Systems, 155, 183-190. Peer reviewed.

High-dimensional compositional data, i.e. multivariate observations carrying relative information, frequently contain values below a detection limit (rounded zeros). We introduce new model-based procedures for replacing these values with reasonable numbers, so that the completed data set is ready for use with statistical analysis methods that rely on complete data, such as regression or classification with high-dimensional explanatory variables. The procedures respect the geometry of compositional data and can be considered as alternatives to existing methods. Simulations show that, especially in high dimensions, the proposed methods outperform existing methods. Moreover, even for a large number of rounded zeros, the new methods improve the quality of the data, which is important for further analyses. The usefulness of the procedures is demonstrated using a data example from metabolomics.
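
To make the notion of a rounded zero concrete, the following minimal sketch shows a simple multiplicative replacement, which is one of the simpler existing approaches and not the model-based procedures proposed in the paper. The function name, the example composition, the detection limit of 0.01, and the replacement fraction of 0.65 are assumptions chosen for illustration only. The sketch shows that zeros can be replaced by a small value below the detection limit while the ratios between the observed parts, which carry the relative information, stay unchanged.

```python
import numpy as np


def multiplicative_replacement(x, detection_limit, fraction=0.65):
    """Replace rounded zeros in a single composition (hypothetical helper).

    Zeros are set to `fraction * detection_limit`; the non-zero parts are
    shrunk by a common factor so that the total of the composition is kept.
    """
    x = np.asarray(x, dtype=float)
    dl = np.broadcast_to(np.asarray(detection_limit, dtype=float), x.shape)
    is_zero = x == 0
    # Imputed value for each rounded zero, 0 for observed parts.
    delta = np.where(is_zero, fraction * dl, 0.0)
    total = x.sum()
    # A common scaling factor preserves all ratios between observed parts.
    adjusted = x * (1.0 - delta.sum() / total)
    return np.where(is_zero, delta, adjusted)


# Illustrative composition with one value below the detection limit.
composition = np.array([0.6, 0.3, 0.1, 0.0])
completed = multiplicative_replacement(composition, detection_limit=0.01)
print(completed)
print(completed[0] / completed[1])  # ratio of the first two observed parts
```

Because every observed part is multiplied by the same factor, log-ratio analyses of the observed parts are unaffected by the replacement. A fixed replacement of this kind ignores the multivariate structure of the data, which is the gap that model-based imputation aims to close.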