Diving into Genetics and Genomics: awk for simple text manipulation

Monday, April 22, 2013

awk for simple text manipulation

Let's say you have a BED file (tab-delimited):

tommy@tommy-ThinkPad-T420:~$ cat file.bed
chr1   100 302
chr2   600 901
chr3   250 383

You want to calculate the average peak length in this file:

tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $3-$2}'| awk '{sum+=$0} END {print "Average= " sum/NR}'
Average= 212
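
The same average can probably be computed with a single awk call and no cat. This is a minimal sketch: demo.bed is a hypothetical sample file that mirrors the three intervals above, and the NR > 0 guard is an addition to avoid dividing by zero on an empty file.

```shell
# Hypothetical sample file mirroring the three intervals shown above.
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > demo.bed

# Sum the interval lengths ($3 - $2) and divide by the number of records.
# The NR > 0 guard (an addition) skips the division for an empty file.
avg=$(awk -F'\t' '{ sum += $3 - $2 } END { if (NR > 0) print sum / NR }' demo.bed)
echo "Average= $avg"
```

Passing the file name to awk directly avoids the extra cat process and the second awk.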

If you calculate the midpoint of each peak as $2 + ($3 - $2)/2, you may get a float, so you will want to round the column:

tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $2+($3-$2)/2}'
201
750.5
316.5

tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $2+($3-$2)/2}'| awk 'function round (A) { return int(A+0.5)} { printf("%d\n", round($0))}'
201
751
317
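
The midpoint and the rounding can also be done in one awk program. The sketch below assumes a hypothetical demo.bed with the same three intervals. Note that int() truncates toward zero, so int(x + 0.5) rounds half up only for non-negative values, which is fine for BED coordinates.

```shell
# Hypothetical sample file mirroring the three intervals shown above.
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > demo.bed

# round() adds 0.5 and truncates; valid for non-negative inputs only.
mids=$(awk -F'\t' 'function round(x) { return int(x + 0.5) }
                   { print round($2 + ($3 - $2) / 2) }' demo.bed)
echo "$mids"
```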

If you want to add an artificial column with peak1, peak2, peak3, and so on:

tommy@tommy-ThinkPad-T420:~$ cat file.bed | awk '{print $1"\t"$2"\t"$3"\t""peak"NR}'
chr1 100 302 peak1
chr2 600 901 peak2
chr3 250 383 peak3
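
If the file has more than three columns, listing each field by hand gets tedious. A possible variant, sketched here on a hypothetical demo.bed, prints the whole line ($0) and appends the name, with OFS set to a tab so the new column is tab-separated.

```shell
# Hypothetical sample file mirroring the three intervals shown above.
printf 'chr1\t100\t302\nchr2\t600\t901\nchr3\t250\t383\n' > demo.bed

# print $0, x joins the untouched input line and the new field with OFS.
named=$(awk -v OFS='\t' '{ print $0, "peak" NR }' demo.bed)
echo "$named"
```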

If you want to change a space-delimited file (three columns here) to a tab-delimited file:

cat foo.txt | awk -F' ' '{ print $1"\t"$2"\t"$3 }' > newfile.txt
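
For a file with any number of columns, a common awk idiom is to set OFS to a tab and assign a field to itself, which makes awk rebuild the line with the new separator. This is a sketch; demo_space.txt and demo_tab.txt are hypothetical file names.

```shell
# Hypothetical space-delimited input; the second line has a double space.
printf 'a b c\nd  e f\n' > demo_space.txt

# The default field splitting treats runs of blanks as one separator.
# $1 = $1 forces awk to rebuild $0 using OFS (a tab) between fields.
awk -v OFS='\t' '{ $1 = $1; print }' demo_space.txt > demo_tab.txt
cat demo_tab.txt
```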

You have two files with three columns each, and you want to subtract the second column of file2 from the third column of file1. After paste, the columns of file1 are $1 to $3 and the columns of file2 are $4 to $6:

paste foo1.txt foo2.txt | awk '{ print $3 - $5 }'
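
A small worked example may help to check which columns $3 and $5 refer to. The files demo1.txt and demo2.txt and their contents are made up for illustration. In each pasted line, fields 1 to 3 come from the first file and fields 4 to 6 from the second, so $3 - $5 is the first file's third column minus the second file's second column.

```shell
# Two hypothetical three-column, tab-delimited files.
printf 'a\t1\t10\nb\t2\t20\n' > demo1.txt
printf 'x\t3\ty\nz\t5\tw\n' > demo2.txt

# Pasted lines look like: a 1 10 x 3 y, so $3 is 10 and $5 is 3.
diffs=$(paste demo1.txt demo2.txt | awk -F'\t' '{ print $3 - $5 }')
echo "$diffs"
```

With these inputs the two lines give 10 - 3 and 20 - 5.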

You want to cut column 3 from file2 and column 1 from file1, and put them side by side (the <( ) process substitution needs bash):

paste <(cut -f3 foo2.txt) <(cut -f1 foo1.txt)
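
In a shell without process substitution, one possible alternative is to paste the two files first and cut afterwards. This sketch assumes every line of the second file has exactly three tab-separated columns, so the first file's column 1 lands in field 4; demo1.txt and demo2.txt are hypothetical files.

```shell
# Two hypothetical three-column, tab-delimited files.
printf 'a\t1\t10\nb\t2\t20\n' > demo1.txt
printf 'x\t3\ty\nz\t5\tw\n' > demo2.txt

# Fields 1-3 come from demo2.txt, fields 4-6 from demo1.txt,
# so fields 3 and 4 are demo2's column 3 and demo1's column 1.
cols=$(paste demo2.txt demo1.txt | cut -f3,4)
echo "$cols"
```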

Comments:

Unknown, July 21, 2015 at 8:18 AM
Spot on!
I used to analyze microarray data with R, and just this year moved to NGS and its UNIX framework. Sometimes it's hard to figure out something as simple as get midpoints from a bed file using awk, but your post pretty much covered it. Nice job.