TAGC 2020

Flexible mixture model approaches that accommodate footprint size variability for robust detection of balancing selection

poster
posted on 2020-04-20, 23:22 authored by Xiaoheng Cheng, Michael DeGiorgio
Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, so most detection approaches achieve optimal performance only when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window size and lose substantial power when windows are large. Here, we employ mixture models to construct a set of five composite likelihood ratio test statistics, which we collectively term B statistics. These statistics are agnostic to window size and can operate on diverse forms of input data. Through simulations, we show that they exhibit power comparable to that of the best-performing current methods and retain high power regardless of window size. They also display considerable robustness to high mutation rates and uneven recombination landscapes, as well as to an array of other common confounding scenarios. We applied B2 to genomic data from two human populations and recovered many top candidates from prior studies, including the then-uncharacterized STPG2 and CCDC169-SOHLH2, both related to gamete functions. We further applied B2 to a bonobo population-genomic dataset. In addition to the MHC-DQ and MHC-DP genes, we uncovered several novel candidate genes, such as KLRD1, involved in viral defense, and SCN9A, associated with pain perception. Finally, we show that our methods can be extended to account for multi-allelic balancing selection, and we have integrated the set of statistics into open-source software named BalLeRMix for future applications by the scientific community.
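
As a rough illustration of the general idea of a mixture-model composite likelihood ratio, and not the BalLeRMix implementation, the sketch below scores a candidate site by comparing a neutral-only model against a mixture whose "balanced" component fades with distance from that site. Everything specific here is an assumption made for illustration: the 1/k neutral spectrum, the binomial balanced component, the exponential decay of the mixing weight, the grid search, and all function and parameter names.

```python
import math

def binom_pmf(k, n, x):
    """Probability of k derived alleles among n when the frequency is x."""
    return math.comb(n, k) * x**k * (1 - x)**(n - k)

def neutral_spectrum(n):
    """Toy neutral expectation (assumption): P(k) proportional to 1/k, k = 1..n-1."""
    weights = [1.0 / k for k in range(1, n)]
    total = sum(weights)
    return {k: w / total for k, w in zip(range(1, n), weights)}

def composite_lr(sites, center, n, x_grid, scale_grid):
    """Toy composite likelihood ratio at `center`.

    sites      : list of (position, derived_count) for polymorphic sites (1 <= count <= n-1)
    x_grid     : candidate equilibrium frequencies, each strictly between 0 and 1
    scale_grid : candidate footprint sizes (same units as position)
    """
    q = neutral_spectrum(n)
    # Null model: every site follows the neutral spectrum.
    null_ll = sum(math.log(q[k]) for _, k in sites)
    # Start from the null so the statistic is never negative.
    best_ll = null_ll
    for x in x_grid:
        for scale in scale_grid:
            ll = 0.0
            for pos, k in sites:
                # Mixing weight (assumption): decays exponentially with distance.
                alpha = math.exp(-abs(pos - center) / scale)
                ll += math.log(alpha * binom_pmf(k, n, x) + (1 - alpha) * q[k])
            best_ll = max(best_ll, ll)
    return 2.0 * (best_ll - null_ll)
```

Because the footprint size is searched over rather than fixed, the same window of sites can be scored without committing to a single window size in advance; sites far from the candidate site simply receive a mixing weight near zero and contribute almost nothing beyond the neutral model.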

Funding

Identifying complex modes of adaptation from population-genomic data

National Institute of General Medical Sciences

SG: Inferring phylogenies under ancestral population structure

Directorate for Biological Sciences

NSF BCS-2001063

Program Number

1021A