Creative Commons License
This blog by Tommy Tang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Tuesday, April 30, 2013

make an average conservation plot based on ChIP-seq data

I asked this question on SEQanswers:
http://seqanswers.com/forums/showthread.php?t=29535

I am saving it here for later reference.



How to make an average conservation plot from ChIP-seq data

Hi all,

I want to know how to make an average conservation plot.
The CEAS website (http://ceas.cbi.pku.edu.cn/) can make something like what I want, but it can only plot
one set of regions at a time. I want to plot two sets of regions and compare them.

" 2. GC content and evolutionary conservation of each ChIP-region and their
average. CEAS uses PhastCons conservation scores from UCSC Genome
Bioinformatics, which is based on multiz alignment of human, chimp, mouse,
rat, dog, chicken, fugu, and zebrafish genomic DNA. CEAS generates thumbnail
conservation plot for each ChIP-region and the average conservation plot for
all the ChIP-regions, which can be directly used in ChIP-chip biologists'
manuscript.
"


Are there any Python scripts or Bioconductor packages that can do this?

Thanks
crazyhottommy
#2, 04-22-2013, 02:05 AM

Hi,

I encountered the same problem a few weeks ago.

It seems that there are not many PhastCons "experts" on this forum.

Let me explain what I did.

I took my ChIP-seq regions, in BED format, and intersected them with the PhastCons data from (mouse example):

Code:
http://hgdownload-test.cse.ucsc.edu/goldenPath/mm9/phastCons30way/

and calculated the conservation score for each genomic position.
Then I averaged all the scores.
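
As a rough sketch of that averaging step (not the exact code used above), the following assumes the scores have already been read into per-base (chrom, start, end, score) tuples and the ChIP-seq regions into (chrom, start, end) tuples. Centering the profile on region midpoints, the `flank` parameter, and the function names `index_scores` and `average_profile` are my own assumptions for illustration.

```python
from collections import defaultdict


def index_scores(records):
    """Index per-base scores as {chrom: {position: score}}."""
    index = defaultdict(dict)
    for chrom, start, end, score in records:
        for pos in range(start, end):
            index[chrom][pos] = score
    return index


def average_profile(regions, index, flank):
    """Mean score at each offset (-flank..+flank) from the region midpoints.

    Positions with no score are left out of the mean; an offset that never
    has a score yields None.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    counts = [0] * width
    for chrom, start, end in regions:
        midpoint = (start + end) // 2
        chrom_scores = index.get(chrom, {})
        for i in range(width):
            score = chrom_scores.get(midpoint - flank + i)
            if score is not None:
                totals[i] += score
                counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# Toy data, made up for illustration only.
scores = index_scores([
    ("chr1", 100, 101, 0.2), ("chr1", 101, 102, 0.4), ("chr1", 102, 103, 0.6),
    ("chr1", 200, 201, 0.4), ("chr1", 201, 202, 0.8),
])
set_a = [("chr1", 100, 103), ("chr1", 200, 203)]
set_b = [("chr1", 100, 103)]

profile_a = average_profile(set_a, scores, flank=1)
profile_b = average_profile(set_b, scores, flank=1)
print(profile_a)
print(profile_b)
```

Calling `average_profile` once per region set gives two lists of the same length, which can be drawn on the same axes with any plotting library to compare the sets. Holding every base of a genome in a dictionary would not scale, so for real data it is probably better to index only the positions that overlap the regions.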

Note that you will need to transform the PhastCons file format a bit,

for example:

original format (I just replaced some words with nothing)
Code:
chrom=chr1 start=3000306
0.006
0.010
0.014
chrom=chrX start=40000306
0.014
chrom=chr9 start=80000306 
0.1
0.2
processed format
Code:
chr1 3000306 3000307 0.006
chr1 3000307 3000308 0.010
chr1 3000308 3000309 0.014
chrX 40000306 40000307 0.014
chr9 80000306 80000307 0.1
chr9 80000307 80000308 0.2
with the following script:
Code:
awk '/^chrom/{split($1,a,"=");split($2,b,"=");next} {printf "%s\t%d\t%d\t%s\n",a[2],b[2],b[2]+1,$1;b[2]++}' filename
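
For anyone who prefers Python, here is a minimal sketch of the same conversion. It assumes header lines contain `chrom=` and `start=` fields (with or without the leading `fixedStep` word) and that each score covers one base. The function name `fixedstep_to_bed` and the `shift_to_zero_based` flag are hypothetical names of my own. The flag is there because wiggle fixedStep positions are 1-based while BED starts are 0-based, so a strict BED conversion would subtract 1 from each start; the default keeps the coordinates as the awk command does.

```python
def fixedstep_to_bed(lines, shift_to_zero_based=False):
    """Yield (chrom, start, end, score) for each base in fixedStep-style lines."""
    chrom, pos, step = None, None, 1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if "chrom=" in line:
            # Header line, e.g. "chrom=chr1 start=3000306" (step is optional).
            fields = dict(f.split("=", 1) for f in line.split() if "=" in f)
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            if shift_to_zero_based:
                pos -= 1  # wiggle starts are 1-based, BED starts are 0-based
            continue
        if chrom is None:
            continue  # ignore anything before the first header line
        yield chrom, pos, pos + 1, float(line)
        pos += step


# The example from the post above.
EXAMPLE = """chrom=chr1 start=3000306
0.006
0.010
0.014
chrom=chrX start=40000306
0.014
chrom=chr9 start=80000306
0.1
0.2
"""

for chrom, start, end, score in fixedstep_to_bed(EXAMPLE.splitlines()):
    print(f"{chrom}\t{start}\t{end}\t{score}")
```

To run it on a real file, pass an open file handle instead of `EXAMPLE.splitlines()`; the generator reads one line at a time, so the whole file does not need to fit in memory.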


Another possibility that I am investigating is using Circos
HTML Code:
http://circos.ca/
but you will first need to study how the software works.

Cheers,
Paolo
paolo.kunder
