tommy@tommy-ThinkPad-T420:~$ cat file.bed
chr1 100 302
chr2 600 901
chr3 250 383
You want to calculate the average peak length (end minus start) of this file:
tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $3-$2}'| awk '{sum+=$0} END {print "Average= " sum/NR}'
Average= 212
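The same average can also be computed in a single awk pass, without cat or a second awk. This is only a sketch: the printf line just recreates the three-line example file (tab-separated is my assumption), and the NR > 0 guard is my addition to avoid dividing by zero on an empty file.

```shell
# Recreate the three-line example file (assumed tab-separated).
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > file.bed

# Accumulate end - start for every line, then divide by the line count.
awk '{ sum += $3 - $2 } END { if (NR > 0) print "Average= " sum / NR }' file.bed
# Average= 212
```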
If you calculate the midpoint of each peak, $2 + ($3 - $2)/2, you may get a float, which you then want to round:
cat file.bed | awk '{print $2+($3-$2)/2}'
201
750.5
316.5
tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $2+($3-$2)/2}'| awk 'function round (A) { return int(A+0.5)} { printf("%d\n", round($0))}'
201
751
317
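The midpoint and the rounding can be folded into one awk call. A minimal sketch, assuming non-negative coordinates (true for BED files), since int(x + 0.5) only rounds half up for values of zero or more. The variable name mid is just for illustration. I avoid printf "%.0f" here because how it rounds exact halves can differ between awk implementations.

```shell
# Recreate the three-line example file (assumed tab-separated).
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > file.bed

# ($2 + $3) / 2 is the same midpoint as $2 + ($3 - $2) / 2.
awk '{ mid = ($2 + $3) / 2; print int(mid + 0.5) }' file.bed
# 201
# 751
# 317
```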
If you want to add an artificial column with peak1, peak2, peak3, and so on:
cat file.bed | awk '{print $1"\t"$2"\t"$3"\t""peak"NR}'
chr1 100 302 peak1
chr2 600 901 peak2
chr3 250 383 peak3
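An alternative way to write this is to set awk's output field separator (OFS) once instead of typing "\t" between every field. A sketch of the same command in that style; the printf line only recreates the example file.

```shell
# Recreate the three-line example file (assumed tab-separated).
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > file.bed

# With OFS set to a tab, the commas in print are emitted as tabs.
awk 'BEGIN { OFS = "\t" } { print $1, $2, $3, "peak" NR }' file.bed
```

The output is the same four tab-separated columns as above.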
If you want to change a space-delimited file to a tab-delimited file (three columns here):
cat foo.txt | awk -F' ' '{ print $1"\t"$2"\t"$3 }' > newfile.txt
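If the file has more than three columns, listing every field gets tedious. A common awk idiom is to set OFS to a tab and assign $1 to itself, which makes awk rebuild the whole line with the new separator. A sketch with toy data (the contents of foo.txt below are made up for illustration); note that runs of several spaces collapse into a single tab.

```shell
# Toy space-delimited input (hypothetical contents).
printf 'a b c d\ne f g h\n' > foo.txt

# $1 = $1 forces awk to re-join all fields using OFS.
awk 'BEGIN { OFS = "\t" } { $1 = $1; print }' foo.txt > newfile.txt
cat newfile.txt
```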
You have two files with three columns each, and you want the third column of file1 minus the second column of file2. After paste, the columns of file2 become $4, $5 and $6:
paste foo1.txt foo2.txt | awk ' { print $3 -$5}'
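To see which field is which after paste, here is a small worked example. The file contents are made up for illustration. $3 is the third column of foo1.txt and $5 is the second column of foo2.txt, so the command prints $3 minus $5 for each row.

```shell
# Toy three-column inputs (hypothetical contents, tab-separated).
printf 'a\t1\t10\nb\t2\t20\n' > foo1.txt
printf 'x\t3\ty\nz\t5\tw\n' > foo2.txt

# Pasted rows look like: a 1 10 x 3 y  ->  $3 = 10, $5 = 3
paste foo1.txt foo2.txt | awk '{ print $3 - $5 }'
# 7
# 15
```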
You want to cut column 3 from file2 and column 1 from file1 and put them side by side (the <( ) process substitution needs bash or a similar shell):
paste <(cut -f3 foo2.txt) <(cut -f1 foo1.txt)
Spot on!
I used to analyze microarray data with R, and just this year moved to NGS and its UNIX framework. Sometimes it's hard to figure out something as simple as getting midpoints from a bed file using awk, but your post pretty much covered it. Nice job.
I am glad that it helps!