10.6084/m9.figshare.12150537.v1
Uma Arora
Uma
Arora
Caleigh Charlebois
Caleigh
Charlebois
Raman Lawal
Raman
Lawal
Beth Dumont
Beth
Dumont
Evolutionary genomics of centromeric satellites in House Mice (Mus)
TAGC 2020
2020
Centromere
Repetitive DNA
Genetic diversity
Evolutionary Biology
Computational Biology
Genomics
Chromosome segregation
Evolutionary Biology
Genomics
Cell Biology
2020-04-20 21:52:21
Poster
https://tagc2020.figshare.com/articles/poster/Evolutionary_genomics_of_centromeric_satellites_in_House_Mice_Mus_/12150537
<p>Centromeres play a conserved role in kinetochore assembly and chromosome segregation. Despite this functional importance, association studies currently ignore the megabases of DNA that span each centromere, because their repetitive sequence content makes them refractory to assembly and analysis with short-read sequencing methods. This leaves a need to define and characterize centromere variation at the population level. To address this knowledge gap, we used data from diverse house mice (genus <i>Mus</i>) to develop a <i>k</i>-mer-based bioinformatic strategy that uses whole-genome shotgun read libraries to quantify centromere satellite copy number and sequence variation. We applied this approach to a sample of 33 laboratory mouse strains and 67 wild-caught mice from nine diverse mouse (<i>Mus</i>) populations and two divergent <i>Mus</i> species (<i>Mus caroli</i> and <i>Mus pahari</i>). Inbred laboratory strains exhibit striking differences in the relative copy number of minor (core centromere) satellite repeats in their genomes. Surprisingly, centromere satellite copy number divergence does not mirror the known phylogenetic relationships between inbred mouse strains. In addition to copy number differences, our analysis uncovers centromere satellite sequence polymorphisms among house mouse strains and subspecies. These differences demonstrate substantial turnover of centromere satellite repeat composition on short evolutionary time scales. Using a <i>de novo</i> assembly strategy based on highly abundant <i>k</i>-mers, we define, for the first time, a centromeric consensus sequence for the distantly related species <i>Mus pahari</i>. Lastly, we uncover phenotypic associations by correlating chromosomal instability phenotypes with centromeric satellite copy number.
These results highlight the power of <i>k</i>-mer-based methods for inferring variation in the sequence content and structure of repetitive, dynamic genomic regions, and provide the first in-depth phylogenetic portrait of centromere sequence evolution across <i>Mus</i>.</p>