Exploring the silent fitness landscape
At a glance
Since Darwin, natural selection has been recognized as one of the major biological forces shaping genetic patterns in molecular data. Detecting selection on proteins has become an indispensable part of genome studies. Remarkably, selection can act not only on proteins but also on synonymous codons, which translate into the same amino acid. This manifests itself as codon bias, which has no influence on the protein sequence but a potentially strong impact on the protein product and associated cellular processes. In addition, mechanisms such as biased gene conversion may result in an excess of synonymous changes with mildly deleterious effects. The role of selection on synonymous changes is often studied by measuring codon usage across the entire gene. This approach, however, lacks power: it ignores evolutionary information and the impact of site-specific synonymous rate variation, which is found in more than one third of proteins. For instance, the use of rare codons at certain sites may slow down translation, producing a ribosomal pause for ubiquitin modification or for co-translational protein folding. Codon choice at such sites may affect protein synthesis or the properties of the product. Synonymous changes at sites of miRNA or siRNA binding may affect protein abundance through a process known as RNA interference (RNAi). Recently, single synonymous mutations have been shown to contribute to human diseases such as cancers and diabetes. Such sites often use rare codons or exhibit high synonymous variability. Here we focus on site-specific synonymous codon bias due to selection or biased gene conversion. We develop statistical methods to identify candidate sites in genome-wide scans of orthologs across species. Deeper insight into the evolutionary dynamics at synonymous sites will come from contrasting fixed differences between species with polymorphisms within populations.
To test predictions of the neutral theory about macro- and microevolutionary forces acting on genomes, we develop a statistical framework for analyzing mixed population/species data, thereby bridging the existing methodological gap between molecular evolution and population genetics models.