Genome Interval Overlap Calculator Help

Overview:

This web application calculates the overlap between two sets of genome intervals, each defined by a CSV file.

Parameters:

  • Overlap Header Name to Generate: the header name of the overlap column that is added to the CSV file generated by this application.
  • Interval CSV File to Test: one of the two interval input files. The file generated by this utility is identical to this input file except that it contains a new overlap column. Each value in that column lies in the range [0, 1] and gives the fraction of that row's interval that overlaps the intervals in the "Interval CSV File to Test Against" (0.0 means no overlap, 1.0 means the interval is fully covered).
  • Chromosome # Header Name: the header name of the chromosome column in the respective input file. Each value in this column should be an integer in the range [1, 20].
  • Interval Start Header Name: the header name of the column that holds the interval start position in base pairs. Each value in this column should be a positive integer.
  • Interval Extent Header Name: the header name of the column that holds the interval extent in base pairs. Each value in this column should be a positive integer.
  • Interval CSV File to Test Against: the CSV file describing the interval set that the intervals from the "Interval CSV File to Test" are tested against. It takes the same Header Name parameters as the "Interval CSV File to Test".
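
The overlap value described in the parameters above can be illustrated with a minimal Python sketch. It assumes that each interval covers the half-open range [start, start + extent), that only intervals on the same chromosome are passed in, and that overlapping "test against" intervals are counted once. The function name overlap_fraction and its arguments are hypothetical. This is not the application's actual code, only one computation that is consistent with the sample data on this page.

```python
def overlap_fraction(start, extent, against):
    """Return the fraction of [start, start + extent) covered by `against`.

    `against` is an iterable of (start, extent) pairs that are assumed to
    lie on the same chromosome as the tested interval. `extent` is assumed
    to be a positive integer, so the final division is safe.
    """
    end = start + extent
    covered = 0
    cursor = start  # positions left of the cursor have already been examined

    # Sorting by start position lets one left-to-right pass count each
    # covered base pair exactly once, even if `against` intervals overlap.
    for a_start, a_extent in sorted(against):
        a_end = a_start + a_extent
        lo = max(a_start, cursor)
        hi = min(a_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi

    return covered / extent
```

For example, overlap_fraction(101233685, 140882, [(100994187, 300000), (101495597, 226506)]) gives roughly 0.4295, which agrees with the gene3 row of the sample result on this page.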

Sample Data Using Default Header Names

Sample Input CSV File to Test:

# Fake gene data
chromosome,intervalStartingPositionInBasePairs,intervalExtentInBasePairs,geneName
1,100577567,290086,gene1
1,100994187,105062,gene2
1,101233685,140882,gene3
1,101495597,226506,gene4
1,101733859,188982,gene5
1,102175064,136185,gene6
1,102419541,110562,gene7
1,102919875,166182,gene8
1,103167535,164075,gene9
1,103389164,194259,gene10
1,103583740,119354,gene11
1,103718250,151207,gene12
1,104103525,144791,gene13
1,104849925,175896,gene14
1,105598514,135340,gene15
1,105733975,112591,gene16
1,105882249,191449,gene17
        

Sample Input CSV File to Test Against:

# Fake Identical by state regions for C57BL/6J and CE/J
#   Genome Data Used:             Imputed SNP Data Lifted to Build 37 Coordinates (Unvalidated)
#   Minimum Extent in SNPs:       10
#   Minimum Extent in Base Pairs: 100000
chromosome,intervalStartingPositionInBasePairs,intervalExtentInBasePairs
1,100577567,374174
1,100994187,300000
1,101495597,226506
1,101733859,188982
1,101923409,174034
1,102175064,136185
1,102419541,150426
1,102919875,166182
1,103167535,164075
1,103389164,325020
1,103718250,10000
1,104297448,103599
1,104569788,139646
1,104849925,175896
1,105507531,226323
1,105733975,112591
1,105882249,204734
1,106224296,133505
1,107894570,259651
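
Both sample input files start with optional "#" comment lines, followed by a header row and the data rows. The Python sketch below shows one way such a file could be read by header name. The function read_intervals is hypothetical, its defaults are the header names used in the samples, and the skipping of "#" lines and the range checks are assumptions based on the samples and the parameter descriptions, not a description of the application's internals.

```python
import csv


def read_intervals(text,
                   chrom_col="chromosome",
                   start_col="intervalStartingPositionInBasePairs",
                   extent_col="intervalExtentInBasePairs"):
    """Yield (chromosome, start, extent, row) for each data row of CSV text."""
    # Assumption: blank lines and lines starting with '#' carry no data.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]

    # DictReader treats the first remaining line as the header row.
    for row in csv.DictReader(lines):
        chrom = int(row[chrom_col])
        start = int(row[start_col])
        extent = int(row[extent_col])
        if not 1 <= chrom <= 20:
            raise ValueError(f"chromosome out of range [1, 20]: {chrom}")
        if start <= 0 or extent <= 0:
            raise ValueError(f"start and extent must be positive: {row}")
        # The full row is kept so extra columns such as geneName survive.
        yield chrom, start, extent, row
```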
        

Sample Result CSV:

chromosome,intervalStartingPositionInBasePairs,intervalExtentInBasePairs,geneName,intervalOverlap
1,100577567,290086,gene1,1.0
1,100994187,105062,gene2,1.0
1,101233685,140882,gene3,0.4294515977910592
1,101495597,226506,gene4,1.0
1,101733859,188982,gene5,1.0
1,102175064,136185,gene6,1.0
1,102419541,110562,gene7,1.0
1,102919875,166182,gene8,1.0
1,103167535,164075,gene9,1.0
1,103389164,194259,gene10,1.0
1,103583740,119354,gene11,1.0
1,103718250,151207,gene12,0.06613450435495712
1,104103525,144791,gene13,0.0
1,104849925,175896,gene14,1.0
1,105598514,135340,gene15,1.0
1,105733975,112591,gene16,1.0
1,105882249,191449,gene17,1.0
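
The rows with partial or zero overlap (gene3, gene12, gene13) can be checked by hand. The short Python sketch below redoes that arithmetic using only numbers from the sample files above. It assumes each interval covers the half-open range [start, start + extent); the variable names are illustrative only.

```python
# gene3 spans [101233685, 101374567).
gene3_start, gene3_extent = 101233685, 140882

# The only "test against" row that reaches into gene3 is 1,100994187,300000,
# which ends at 101294187. The next row starts at 101495597, after gene3 ends.
region_end = 100994187 + 300000
gene3_covered = region_end - gene3_start        # 60502 base pairs
gene3_overlap = gene3_covered / gene3_extent    # about 0.42945

# gene12 starts at 103718250. Row 1,103389164,325020 ends at 103714184, just
# before gene12, so only row 1,103718250,10000 contributes its 10000 base pairs.
gene12_extent = 151207
gene12_covered = 10000
gene12_overlap = gene12_covered / gene12_extent  # about 0.06613

# gene13 spans [104103525, 104248316). The preceding "test against" row ends
# at 103728250 and the following one starts at 104297448, so nothing overlaps.
gene13_end = 104103525 + 144791
gene13_overlap = 0.0 if (103718250 + 10000) <= 104103525 and gene13_end <= 104297448 else None

print(gene3_overlap, gene12_overlap, gene13_overlap)
```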