CGD SNP Database for Mouse (NCBI build 36)
CGD loads SNP data from various sources. We have data for 51 different strains of mice. SNPs detected in previous genome assembly builds are
converted to the current build.
Ensembl annotations are used to specify the functional implication of each SNP.
CGD processes the input data by filtering out all inconsistencies before loading data into the database.
CGD does not load the following cases:
- SNPs that are unmapped in the C57BL/6J genome ("N")
- Multiple SNPs that map to the same location (redundant SNPs) on the same chromosome in the C57BL/6J genome
- SNPs whose provided C57BL/6J allele can't be verified against our C57BL/6J genome of the same build
- SNPs whose provided C57BL/6J flanking sequences (5' and 3') can't be verified (100% match) against our C57BL/6J genome of the same build
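The four exclusion rules above can be sketched as a single filtering pass. This is only an illustration under stated assumptions, not CGD's actual loader: the `SnpRecord` layout, the `filter_snps` name, and the in-memory `reference` dictionary are hypothetical. The sketch also assumes that an unmapped SNP has no position or an "N" reference allele, and that every SNP sharing a position with another is dropped rather than one copy being kept.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnpRecord:
    # Hypothetical record layout; field names are illustrative only.
    snp_id: str
    chrom: str
    pos: Optional[int]  # 1-based position on C57BL/6J, None if unmapped
    b6_allele: str      # C57BL/6J allele as provided by the source
    flank5: str         # 5' flanking sequence as provided by the source
    flank3: str         # 3' flanking sequence as provided by the source


def filter_snps(snps, reference):
    """Return the SNPs that survive the four exclusion rules.

    `reference` maps chromosome name -> C57BL/6J sequence string (assumption).
    """
    # Count how many SNPs land on each (chromosome, position) pair.
    position_counts = Counter(
        (s.chrom, s.pos) for s in snps if s.pos is not None
    )
    kept = []
    for s in snps:
        seq = reference.get(s.chrom)
        # Rule 1: unmapped in the C57BL/6J genome.
        if s.pos is None or s.b6_allele == "N" or seq is None:
            continue
        # Rule 2: redundant SNPs at the same location (assumed: drop all of them).
        if position_counts[(s.chrom, s.pos)] > 1:
            continue
        # Rule 3: the provided C57BL/6J allele must match the reference base.
        i = s.pos - 1
        if not (0 <= i < len(seq)) or seq[i] != s.b6_allele:
            continue
        # Rule 4: both flanking sequences must match the reference exactly.
        five = seq[max(0, i - len(s.flank5)):i]
        three = seq[i + 1:i + 1 + len(s.flank3)]
        if five != s.flank5 or three != s.flank3:
            continue
        kept.append(s)
    return kept
```

Counting positions before the main loop keeps the redundancy check independent of input order, so the result does not depend on which duplicate happens to appear first.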
Current Sources

| Source Name | Initial SNP count | Source NCBI build | Total loaded into CGD | Total filtered out |
|---|---|---|---|---|
| Perlegen | 8,272,574 | ncbi_build_36 | 8,249,897 (about 99.73%) | Total: 22,677 |
| Broad | 138,793 (138,608 in build 36) | ncbi_build_33 | 138,533 (about 99.8%) | 260 (75); mapping to build 36 failed: 185; mapped to build 36 but position not verified: 58; mapped to build 36 but bad flanking sequence(s): 17 |

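The filtered-out totals and loaded percentages in the table follow directly from the initial and loaded counts. A small sketch that re-derives them is shown here; the `summarize` helper and the dictionary layout are illustrative names, and the numbers are copied from the table.

```python
# Counts copied from the Current Sources table.
sources = {
    "Perlegen": {"initial": 8_272_574, "loaded": 8_249_897},
    "Broad": {"initial": 138_793, "loaded": 138_533},
}

# Broad breakdown from the table: failed mapping, unverified position, bad flanking.
broad_breakdown = {"mapping_failed": 185, "position_not_verified": 58, "bad_flanking": 17}


def summarize(initial, loaded):
    """Return (number filtered out, percentage loaded rounded to 2 decimals)."""
    filtered = initial - loaded
    pct_loaded = round(100 * loaded / initial, 2)
    return filtered, pct_loaded


for name, counts in sources.items():
    filtered, pct = summarize(counts["initial"], counts["loaded"])
    print(f"{name}: filtered out {filtered:,}, loaded {pct}%")

# The three Broad categories should add up to the Broad filtered-out total.
print("Broad breakdown total:", sum(broad_breakdown.values()))
```

The Broad breakdown is also internally consistent: the 185 mapping failures account for the drop from 138,793 to 138,608, and the remaining 58 + 17 = 75 are SNPs that mapped to build 36 but failed verification.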

Broad and Perlegen share a total of 71,508 SNPs.

SNPs located in bad SNP cluster windows: Broad: 2,687; Perlegen: ?

Data conflict between sources
- SNP allele conflict: We say that two SNP sources have a SNP allele conflict if and only if:
  - the snp_allele from source1 != "N"
  - the snp_allele from source2 != "N"
  - source1_snp_allele != source2_snp_allele

  So the following cases are not conflicts: N/A, A/N, T/N, N/T, N/C, C/N, G/N, N/G
  Total count of SNPs where Broad and Perlegen do not agree on the SNP allele: 126
- Genotype allele conflict: We say that two SNP sources have a genotype allele conflict for a given strain if and only if:
  - the genotype allele from source1 != "N"
  - the genotype allele from source2 != "N"
  - source1_genotype_allele != source2_genotype_allele

  So the following cases are not conflicts: N/A, A/N, T/N, N/T, N/C, C/N, G/N, N/G
  Total count of SNPs where Broad and Perlegen do not agree on the genotype allele for a given strain: 4,841
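
Both conflict definitions above reduce to the same three-part test, which can be sketched as one predicate plus two counting helpers. The function names and the dictionary-based inputs are hypothetical, chosen only for illustration. The genotype helper assumes that a SNP is counted once if at least one strain shared by both sources conflicts; the source does not say whether the 4,841 figure counts SNPs or SNP/strain pairs.

```python
def is_conflict(allele1, allele2):
    """True only when both calls are known (not "N") and they differ."""
    return allele1 != "N" and allele2 != "N" and allele1 != allele2


def count_snp_allele_conflicts(source1, source2):
    """Count shared SNPs whose SNP alleles conflict.

    Each input maps snp_id -> snp_allele (hypothetical layout).
    """
    shared = source1.keys() & source2.keys()
    return sum(is_conflict(source1[s], source2[s]) for s in shared)


def count_genotype_conflicts(geno1, geno2):
    """Count shared SNPs with a genotype conflict in at least one shared strain.

    Each input maps snp_id -> {strain: genotype_allele} (hypothetical layout).
    """
    total = 0
    for snp_id in geno1.keys() & geno2.keys():
        strains = geno1[snp_id].keys() & geno2[snp_id].keys()
        if any(
            is_conflict(geno1[snp_id][st], geno2[snp_id][st]) for st in strains
        ):
            total += 1
    return total
```

Because "N" on either side short-circuits the predicate, pairs such as N/A or G/N are never reported, matching the list of non-conflict cases.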