SNPtools is an R package that selects SNPs from a set of inbred strains. The SNPs are a combination of the Sanger SNPs and a set of SNPs imputed onto 88 laboratory inbred strains by UNC. The software queries the SNP files and quickly returns SNPs for the requested strains in the region of interest. It then allows the user to plot these SNPs, intersect them with genes and classify them. The SNP files are zipped and indexed using Tabix. The gene locations are derived from the Mouse Genome Informatics gene feature file.


SNP Retrieval and Manipulation (PDF) - Updated Sept 20, 2012
This vignette demonstrates how to use the SNP manipulation functions to retrieve, subset and plot SNPs from large data sets.


SNPtools R package SNPtools_1.01.tar.gz - Updated Sept 20, 2012

There is currently an installation issue in which R fails to find the package DESCRIPTION file and terminates the installation. This occurs only when you install from the R GUI. To install on Mac/UNIX or Windows systems, open a terminal (or a command prompt on Windows), change to the directory containing the downloaded file, and type: R CMD INSTALL SNPtools_1.01.tar.gz

SNP Collections


1. Sanger Mouse Genome SNPs for the 17 sequenced strains
The traditional reference strain, C57BL/6J, was not included in this effort. C57BL/6NJ was sequenced instead, and it is important to remember that this strain may differ from C57BL/6J. The strains available in the file are: 129P2/OlaHsd, 129S1/SvImJ, 129S5SvEvBrd, A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6NJ, CAST/EiJ, CBA/J, DBA/2J, LP/J, NOD/ShiLtJ, NZO/HlLtJ, PWK/PhJ, SPRET/EiJ and WSB/EiJ.

Mouse genomic variation and its effect on phenotypes and gene regulation
Keane TM, Goodstadt L, Danecek P, et al.
Nature. 2011 Sep 14;477(7364):289-94. PMCID: PMC3276836. [ Full Text ]

Sequence-based characterization of structural variation in the mouse genome
Yalcin B, Wong K, Agam A, Goodson M, Keane TM, Gan X, Nellåker C, Goodstadt L, Nicod J, Bhomra A, Hernandez-Pliego P, Whitley H, Cleak J, Dutton R, Janowitz D, Mott R, Adams DJ, Flint J.
Nature. 2011 Sep 14;477(7364):326-9.

MGI file from June 6, 2012
MGI.sorted.txt.gz (26 MB)
MGI.sorted.txt.gz.tbi (803 KB)


2. Sanger/UNC Imputed SNPs
Leonard McMillan's group at the University of North Carolina (UNC) has used the Sanger SNPs to impute allele calls for 88 strains (Imputed Mouse SNP Resource). The Sanger and UNC data have been combined in a single zipped, Tabix-indexed file. Each allele call has a confidence score of 0, 1 or 2, corresponding to low, medium or high confidence in the call. As with the Sanger SNPs alone, SNPs can be subset by genomic region, strains, polymorphic status and quality.
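
The subsetting described above can be illustrated with a small base R sketch. It does not use the SNPtools API or the real file layout: the toy data frame, its column names (one allele column and one ".conf" column per strain) and the helper filter_snps() are all assumptions made for illustration only.

```r
# Toy stand-in for a few rows of a combined SNP table. All values are invented,
# and this column layout is an assumption, not the actual file format.
snps <- data.frame(
  chr         = c(1, 1, 1, 1),
  pos         = c(3000100, 3000250, 3000400, 3000900),
  A_J         = c("A", "G", "T", "C"),
  A_J.conf    = c(2, 2, 1, 2),
  DBA_2J      = c("A", "T", "C", "G"),
  DBA_2J.conf = c(2, 2, 2, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep SNPs that lie in [start, end], whose calls meet
# min_conf in every requested strain, and that differ among those strains.
filter_snps <- function(snps, strains, start, end, min_conf = 2) {
  in_region <- snps$pos >= start & snps$pos <= end
  alleles   <- as.matrix(snps[, strains, drop = FALSE])
  conf      <- as.matrix(snps[, paste0(strains, ".conf"), drop = FALSE])
  # Quality: every requested strain must reach the confidence threshold.
  high_quality <- apply(conf >= min_conf, 1, all)
  # Polymorphic status: more than one distinct allele among the strains.
  polymorphic  <- apply(alleles, 1, function(x) length(unique(x)) > 1)
  snps[in_region & high_quality & polymorphic, , drop = FALSE]
}

result <- filter_snps(snps, strains = c("A_J", "DBA_2J"),
                      start = 3000000, end = 3000500)
print(result$pos)
```

Lowering min_conf to 1 in this toy example would also retain the SNP whose A_J call has medium confidence; whether that trade-off is acceptable depends on the analysis.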

Use of these data should cite the following reference:
Computation of Single-Nucleotide Polymorphisms in Inbred Mice Using Local Phylogeny
Wang JR, Pardo-Manuel de Villena F, Lawson HA, Cheverud JM, Churchill GA, McMillan L.
Genetics. 2012 Feb;190(2):449-58. [ datasets ] [ software ]

Build 37, April 2012
Sanger.UNC.Combined.SNPs.txt.gz (1 GB)
Sanger.UNC.Combined.SNPs.txt.gz.tbi (2 MB)