Fernando Pardo-Manuel de Villena

Population Genetics of Inbred Strains

 

The fundamental paradigm of the work at the Center for Genome Dynamics derives from our understanding that the physical and functional organization of the genome is a consequence of its evolution, and that this organization can be deciphered by exploiting the unique evolutionary experiment that inbred strains of mice provide (Yang et al. Nature Genetics 2007). Doing so requires that the genomic markers (SNPs) we use for mapping and the functional allelic variation they tag arose in the same branches of the evolutionary tree, that the density of our markers approximates average gene densities, and that we can carry out the requisite genotyping in a cost-effective manner. Because the SNPs described in existing databases do not meet these requirements, we have used 109 million genotypes obtained by microarray resequencing of 15 inbred strains, including representatives from each of the three major mouse subspecies, to generate two sets of 25,400 phylogenetic trees. Each set contains one tree for each consecutive 100 kb interval, and the second set is displaced 50 kb with respect to the first. In each interval we determined all the strain distribution patterns (SDPs) represented in the tree and their respective frequencies. This SDP database was then used to select 400,000 SNPs representing each one of the phylogenetic branches observed in the local trees. Computationally, each segment represents a polyallelic system in which we know the time in evolution when the alleles arose, and we have an efficient means of typing these alleles across an extensive sample of inbred strains.
The resulting data will 1) provide considerably improved maps of linkage disequilibrium (LD) domains and networks, 2) allow us to investigate the evolutionary forces responsible for the assembly of the LD domains and networks, 3) improve the reliability and resolution of in silico QTL mapping, 4) identify and map historical recombination events and relate these to current maps of recombination hotspots, and 5) address several basic evolutionary questions for which the genus Mus is exceptionally well suited, primary among them being the validity of Wright’s Shifting Balance theory.
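
The interval scheme and SDP tabulation described above can be illustrated with a minimal Python sketch. It is not the Center's pipeline: it skips tree building entirely and tallies SDPs directly from SNP genotypes, and it assumes biallelic SNPs with no missing calls. The function names (`sdp`, `sdp_frequencies`), the input layout, and the four-strain toy data are hypothetical; only the 100 kb interval length and the 50 kb displacement come from the text.

```python
from collections import Counter, defaultdict

WINDOW = 100_000  # interval length in bp (from the text)
OFFSET = 50_000   # displacement of the second interval set (from the text)


def sdp(alleles):
    """Return a canonical strain distribution pattern for one SNP.

    `alleles` holds one allele per strain, in a fixed strain order.
    Strains sharing the first strain's allele are coded '0', all
    others '1', so a pattern and its complement give the same string.
    Assumption: the SNP is biallelic and has no missing calls.
    """
    ref = alleles[0]
    return "".join("0" if a == ref else "1" for a in alleles)


def sdp_frequencies(snps, offset=0, window=WINDOW):
    """Count SDPs per interval for one chromosome.

    `snps` is an iterable of (position, alleles) pairs.
    Returns {interval_start: Counter({sdp_string: count})}.
    """
    table = defaultdict(Counter)
    for pos, alleles in snps:
        if pos < offset:
            continue  # SNP lies before the first interval of this set
        start = (pos - offset) // window * window + offset
        table[start][sdp(alleles)] += 1
    return dict(table)


# Toy data: four hypothetical strains, four SNPs (not real genotypes).
toy_snps = [
    (10_000, "AAGG"),
    (60_000, "CCTT"),
    (120_000, "ATAA"),
    (160_000, "GGGC"),
]

first_set = sdp_frequencies(toy_snps)                  # intervals at 0, 100 kb, ...
second_set = sdp_frequencies(toy_snps, offset=OFFSET)  # intervals at 50 kb, 150 kb, ...
```

In this toy example the two SNPs in the first interval share the pattern `0011`, so that pattern has a count of 2 there. In the displaced set the same SNPs fall into different intervals, which is the point of keeping two overlapping sets: a pattern split by one set's interval boundary can still be seen whole in the other.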

Specifically, we are:

  1. Establishing a collection of DNA from a comprehensive set of inbred strains.

  2. Identifying an unbiased set of 400,000 SNPs representing the diversity present among four mouse subspecies: M. m. domesticus, M. m. musculus, M. m. castaneus, and M. m. molossinus.

  3. Genotyping these SNPs on 2,000 mouse strains and individual samples.

  4. Generating a genome-wide map of the phylogenetic origin of each genomic region in each strain.