Leonard McMillan & Wei Wang

Computational Tools for Systems Genetics

 

We are focusing on two projects within the Center:

Data Management

We are constructing a database to host genotyping, transcript, copy-number variation, and methylation data, both for Center projects and for external users of the Mouse Diversity Genotyping array. This database will serve as the primary repository for the Center’s experimental results. In the future, it will also serve as a portal providing public access to the results of computationally expensive analysis and data-mining tasks. In this project we are exploiting underutilized capabilities of modern database management systems, such as active database and temporal database technologies.
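As a rough illustration of how active and temporal features can work together, the sketch below uses a trigger (the "active" part) to record each superseded genotype call with a timestamp (a simple form of "temporal" history). SQLite is only a stand-in here, since the Center's actual database system and schema are not described above; the table names, column names, and sample data are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genotype (
    sample_id   TEXT NOT NULL,
    marker_id   TEXT NOT NULL,
    allele_call TEXT NOT NULL,
    PRIMARY KEY (sample_id, marker_id)
);

-- History table: one row per superseded call, with the time it was replaced.
CREATE TABLE genotype_history (
    sample_id     TEXT NOT NULL,
    marker_id     TEXT NOT NULL,
    old_call      TEXT NOT NULL,
    superseded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Active rule: whenever a call is updated, archive the previous value.
CREATE TRIGGER genotype_audit
AFTER UPDATE OF allele_call ON genotype
BEGIN
    INSERT INTO genotype_history (sample_id, marker_id, old_call)
    VALUES (OLD.sample_id, OLD.marker_id, OLD.allele_call);
END;
""")

# Hypothetical sample and marker identifiers.
conn.execute("INSERT INTO genotype VALUES ('S1', 'M1', 'A')")
conn.execute(
    "UPDATE genotype SET allele_call = 'G' "
    "WHERE sample_id = 'S1' AND marker_id = 'M1'"
)

current = conn.execute(
    "SELECT allele_call FROM genotype WHERE sample_id = 'S1' AND marker_id = 'M1'"
).fetchone()[0]
history = conn.execute(
    "SELECT sample_id, marker_id, old_call FROM genotype_history"
).fetchall()
```

After the update, `current` holds the revised call and `history` holds the earlier one, so a corrected genotype never silently overwrites what was previously published.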

Genome Compatibility and Tree-based Association Analysis

We are developing efficient methods for partitioning genomes into parsimonious sets of compatible intervals. Our algorithms aim to find, on a genome-wide scale, every possible haplotype interval that shows no apparent recombination or homoplasy. We are also identifying subsets of single nucleotide polymorphisms (SNPs) whose removal could reduce the number of intervals needed to cover all markers. Each genomic region has a unique phylogenetic tree describing the relatedness of the haplotypes in that region. Given a set of local phylogenetic trees spanning the entire genome, we use these trees as high-level markers for primary phenotype association mapping.
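To make the idea of compatible intervals concrete, here is a minimal sketch. It assumes that "compatible" means passing the standard four-gamete test: two biallelic SNPs are compatible when fewer than four of the allele combinations 00, 01, 10, and 11 appear across the samples. It also assumes phased haplotypes with no missing calls. The greedy left-to-right scan is one simple way to obtain a non-overlapping cover and is not necessarily the method used in this project; the function names and toy data are hypothetical.

```python
def compatible(snp_a, snp_b):
    """Four-gamete test: True if fewer than four distinct allele pairs occur."""
    return len(set(zip(snp_a, snp_b))) < 4


def greedy_compatible_intervals(snps):
    """Split position-ordered SNPs into consecutive compatible intervals.

    Each interval is extended until the next SNP conflicts with any SNP
    already in it. Returns a list of (start, end) index pairs, inclusive.
    """
    intervals = []
    start = 0
    for j in range(1, len(snps)):
        # Close the current interval if SNP j is incompatible with any member.
        if any(not compatible(snps[i], snps[j]) for i in range(start, j)):
            intervals.append((start, j - 1))
            start = j
    if snps:
        intervals.append((start, len(snps) - 1))
    return intervals


# Hypothetical toy data: rows are SNPs, columns are four haplotypes.
snps = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 1],  # shows all four gametes against the first SNP
    [0, 1, 1, 1],
]
intervals = greedy_compatible_intervals(snps)
print(intervals)  # [(0, 1), (2, 3)]
```

Because every sub-interval of a compatible interval is itself compatible, this greedy scan uses as few consecutive intervals as possible for a non-overlapping cover. Enumerating every possible compatible interval, or choosing SNPs to remove, requires more than this sketch shows.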