Thursday, August 7, 2014

Understanding the forward strand, the reverse strand, and coordinate systems

We are on day 4 of the MSU NGS course. In the morning, instructor Istvan introduced genomic intervals. To understand the coordinate systems, one first needs to understand the strandedness of DNA.

The sense strand is the coding strand.
The anti-sense strand is the reverse complement of the coding strand.
See details below:
Again, everything is on Biostar :)
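
To make the "reverse complement" idea concrete, here is a minimal Python sketch. The function name `reverse_complement` and the example sequence are my own, made up for illustration; they are not from the course material.

```python
# Translation table mapping each DNA base to its complement.
# N (unknown base) maps to itself; lowercase is handled too.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the string so the
    # result again reads 5' -> 3'.
    return seq.translate(COMPLEMENT)[::-1]

sense = "ATGGCC"                      # hypothetical coding-strand sequence
antisense = reverse_complement(sense)
print(antisense)                      # GGCCAT
```

Applying the function twice gives back the original sequence, which is a quick sanity check that the two strands describe the same piece of DNA.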

I drew a picture to better understand it

Coordinates are reported 5' ---> 3' on the forward strand.
Transcription proceeds from 5' to 3'.
The forward/plus strand and the reverse/minus strand are designated arbitrarily.
Imagine that you could flip over the example I drew; gene A would then be on the minus strand.
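
The "flip it over" thought experiment can be sketched in a few lines of Python. This is only an illustration under my own assumptions: intervals are 0-based and half-open, and `flip_interval` is a hypothetical helper, not something from a library. Real annotation files do not flip anything; they keep forward-strand coordinates and record the strand in a separate column.

```python
def flip_interval(start, end, seq_length):
    """Map a 0-based, half-open interval [start, end) on one strand
    to the same bases counted 5' -> 3' along the opposite strand."""
    # Position i on one strand corresponds to seq_length - 1 - i on the
    # other, so the interval's end becomes the new start and vice versa.
    return seq_length - end, seq_length - start

# Hypothetical example: gene A covers [10, 30) on a 100 bp sequence.
print(flip_interval(10, 30, 100))   # (70, 90)
```

The interval keeps its length (20 bases in the example), and flipping twice returns the original coordinates.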

# 0-based and 1-based coordinate systems

0-based and 1-based coordinates cheat sheet

various formats:
GFF3 specification:
0-based formats: BED, bedGraph
1-based formats: GFF, GTF, GBK (GenBank file), SAM, VCF, wiggle (fixedStep/variableStep)
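
Converting between the two conventions is a classic source of off-by-one errors, so here is a small Python sketch. It assumes the usual definitions: BED intervals are 0-based and half-open (the end is excluded), GFF intervals are 1-based and closed (the end is included). The function names are my own.

```python
def bed_to_gff(start, end):
    """Convert a 0-based, half-open interval to 1-based, closed."""
    # Only the start shifts; the end keeps the same number.
    return start + 1, end

def gff_to_bed(start, end):
    """Convert a 1-based, closed interval to 0-based, half-open."""
    return start - 1, end

def bed_length(start, end):
    """Number of bases covered by a 0-based, half-open interval."""
    return end - start

def gff_length(start, end):
    """Number of bases covered by a 1-based, closed interval."""
    return end - start + 1

# The first 100 bases of a chromosome in both conventions.
print(bed_to_gff(0, 100))   # (1, 100)
print(gff_to_bed(1, 100))   # (0, 100)
print(bed_length(0, 100), gff_length(1, 100))   # 100 100
```

One nice property of the 0-based, half-open convention is that the length is simply end minus start, with no "+ 1" to remember.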

# lift over coordinates
Lift-over converts coordinates between different versions of a genome.
Generally, avoid doing it; just map to the genome version you are interested in.
By the way, the latest human genome assembly, GRCh38, has been released:

