Applied Computational Genomics

Summary & introduction
The Applied Computational Genomics group focuses on theoretical and computational aspects of modelling genome evolution and adaptive change. As molecular data grow in size and complexity, we strive to keep pace by providing accurate, scalable and practical computational solutions that enable a wide range of scientists to analyse patterns of evolution and natural selection in large genomic and other omics data.
Our goal is to bring new bioinformatics methods to real applications ranging from biotechnology to biomedical research, ecology and agriculture.
Evolutionary analyses of selective pressures in genomic data have high potential for applications, since natural selection is a leading force in the conservation of function and in adaptation to emerging pathogens and new environments, and plays a key role in immune and resistance systems.
We develop phylogenetic methods for protein-coding sequences that make it possible to evaluate selective pressure and detect instances of adaptation based on genomic signatures. Our computational methods serve to generate new biological hypotheses and predictions for further experimental validation, with the ultimate goal of developing practical applications.
We embrace an interdisciplinary approach by integrating different data sources and combining methods from biology, mathematics, and computer science.
Our team
Dr. Maria Anisimova, Group Leader, Principal Investigator
Dr. Victor Garcia, Senior Research Associate
Dr. Manuel Gil, Senior Research Associate
Dr. Anna Koroleva, Research Associate / Postdoc
Dr. Massimo Maiolo, Research Associate / Postdoc
Dr. Julija Pecerska, Research Associate / Postdoc
Projects & services
- Fast alignment and phylogeny in a frequentist framework
Evolutionary thinking helps to disentangle the underlying biological mechanisms that shape molecular data. Genomic sequences of common origin are routinely used to infer phylogenies, which provide a test bed for biological hypotheses or support downstream analyses. Based on fast approximation algorithms, we aim to account for alignment uncertainty during phylogeny estimation in the frequentist setting. This will allow for more accurate phylogenetic inferences from vast high-throughput data.
- BioSoda - Data Integration in BioSoda, NRP 75 Big Data
This project aims to enable sophisticated semantic queries across large, decentralized and heterogeneous databases via an intuitive interface. The system will enable scientists, without prior training, to perform powerful joint queries across resources in ways that cannot be anticipated in advance, going far beyond the query functionality of specialized knowledge bases. The project is an interdisciplinary collaboration between information systems and bioinformatics.
- Evolution and function of genomic tandem repeats
We develop statistical phylogenetic methods for analysing tandem repeats in genomic sequences. For example, leucine-rich repeats (LRRs) in plant resistance genes provide a source of adaptation to emerging pathogens, so detecting selection on LRRs can suggest ways to improve crop resistance (Schaper and Anisimova 2014, New Phytol).
- Stochastic models for protein-coding genes
We develop methods to study the effects of selection on amino acid and codon mutation patterns. These methods can help to identify drug targets and study somatic processes. Our recent antibody model captures the sui generis mechanism specific to somatic hypermutation in maturing antibodies (Mirsky et al 2015, Mol Biol Evol). This provides a basis for new bioinformatics methods for antibody analysis, which are necessary for antibody selection and synthesis in the commercial context.
- Lighthouse project (Sinergia SNSF) - Trans-omic approach to colorectal cancer: an integrative computational and clinical perspective
List of current publications
- Uwate, Yoko; Schüle, Martin; Ott, Thomas; Nishio, Yoshifumi, 2020. Echo state network with chaos noise for time series prediction. In: Proceedings of the 2020 International Symposium on Nonlinear Theory and its Applications. International Symposium on Nonlinear Theory and its Applications (NOLTA), Okinawa, Japan, 16–19 November 2020. pp. 274.
- 2020. The collaborative learning cellular automata density classification problem. In: Proceedings of the 2020 International Symposium on Nonlinear Theory and its Applications. International Symposium on Nonlinear Theory and its Applications (NOLTA), Okinawa, Japan, 16–19 November 2020. pp. 268.
- Maiolo, Massimo; Ulzega, Simone; Gil, Manuel; Anisimova, Maria, 2020. Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform. NAR Genomics and Bioinformatics. 2(4), pp. lqaa092. Available from: https://doi.org/10.1093/nargab/lqaa092
- Miniussi, Myriam; Ott, Thomas; Fellermann, Harold, 2020. Impact of noise and network size in coupled maps with asymmetric influence amplification. In: Proceedings of the NOLTA 2020 Conference. 2020 International Symposium on Nonlinear Theory and Its Applications (NOLTA2020), Online Conference, 16-19 November 2020. pp. 282-285.
- Gygax, Gregory; Füchslin, Rudolf Marcel; Ott, Thomas, 2020. In: Proceedings of the NOLTA 2020 Conference. 2020 International Symposium on Nonlinear Theory and Its Applications (NOLTA2020), Online Conference, 16-19 November 2020. pp. 278-281.
List of current projects
- Trans-omic approach to colorectal cancer: an integrative computational and clinical perspective
Colorectal cancer (CRC) is an important cause of cancer-related mortality worldwide. The Consensus Molecular Subtypes represent the first comprehensive molecular classification with clinical implications, but many aspects are still missing. We use a trans-omic approach to improve the stratification, prognosis, and ...
- Data mining in neurological medicine
Restless legs syndrome (RLS, Willis-Ekbom disease) is a neurological movement disorder characterised by motor and sensory symptoms, such as the uncontrollable need to move the legs (and sometimes also the arms). This need is associated with an unpleasant and disturbing sensation in the lower limbs that typically ...
Software tools
TRAL
Tandem Repeat Annotation Library (TRAL, Schaper et al, submitted)
https://github.com/elkeschaper/TandemRepeats/tree/master
CodonPhyML
Maximum likelihood phylogeny for protein-coding genes (Gil et al 2013, Mol Biol Evol)
http://sourceforge.net/projects/codonphyml/
ProGraphMSA
Fast probabilistic graph-based phylogeny-aware alignment with tandem repeats (Szalkowski and Anisimova 2013, Nucleic Acids Res)
http://sourceforge.net/projects/prographmsa/