I highly recommend this course to everyone. http://bioinformatics.msu.edu/ngs-summer-course-2014
This morning we learned SNP calling with samtools and the SAM file specification (I will write another blog post on SNP calling). In the evening, TA Elijah gave an awesome introduction to Linux commands.
Personally, I think this should be taught on the first day of the course. (I am already pretty familiar with basic Linux commands, but not knowing them does cause a lot of frustration for beginners.)
I took notes and put the commands that were taught in a gist; see below, and enjoy Linux commands!
#linux commands basics
#http://software-carpentry.org/v5/novice/shell/index.html
# practise, practise, practise, google, google, google and you will get it :)
pwd # print working directory
cd # change directory
sudo # run a command with superuser privileges
chmod 775 # change the file permissions http://en.wikipedia.org/wiki/Chmod
git clone # version control! get to know git and github! http://git-scm.com/
sudo bash # start a root shell; a bad habit
whoami # print your user name
exit # leave the current shell
logout # log out of a login shell
echo # it's like print
ls -sh # show file sizes in human-readable units (K, M, G)
ls -F # mark entries by type: / after directories, * after executables
ls -l # long format
ls -a # list everything, including hidden files (names starting with .)
ls /home/tommy/data # list files in the data folder by specifying a full path
clear # clear the screen
# remember the tab magic: auto-completion
# the filesystem: slash / denotes the root directory, ~ denotes the home directory
cd - # change directory to your previous directory
cd ~ # change directory to your home directory (plain cd does the same)
cd .. # change directory one level up
cd ../.. # change directory two levels up
cp /home/tommy/data.txt . # copy data.txt to the current directory, which is denoted by .
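# A minimal sketch of the navigation commands above, run inside a throwaway
# directory. The folder names project and data are made up for illustration.

```shell
workdir=$(mktemp -d)                 # scratch directory, safe to delete afterwards
mkdir -p "$workdir/project/data"     # hypothetical nested folders
cd "$workdir/project/data"
pwd                                  # path ends in .../project/data
cd ..                                # one level up: now in project
cd ../..                             # two levels up: the parent of the scratch directory
cd - > /dev/null                     # back to the previous directory (project); cd - also prints it
pwd                                  # path ends in .../project
```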
# create things: text editors!
nano # use this one, because it is easy
emacs # do not use it until a later stage
vi # 90% of people do not know how to exit vim :)
cat mydata.txt # print the content of mydata.txt to standard output (the screen)
cat mydata1.txt mydata2.txt > total_data.txt # concatenate two files into a new file
ctrl + D # signal end of input (tells cat this was the last line) when you do something like: cat > mydata.txt
head -15 # print the first 15 lines to the screen
tail # print the last 10 lines to the screen by default
less # use less to page through text files. less is more
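# A small sketch of cat, head and tail on generated files. It assumes the seq
# command is available (it is on most Linux and macOS systems); the file
# contents are just the numbers 1 to 200, made up for illustration.

```shell
workdir=$(mktemp -d)                 # scratch directory so no real files are touched
cd "$workdir"
seq 1 100 > mydata1.txt              # numbers 1..100, one per line
seq 101 200 > mydata2.txt            # numbers 101..200
cat mydata1.txt mydata2.txt > total_data.txt   # concatenate into one 200-line file
head -n 15 total_data.txt            # prints lines 1 to 15
tail total_data.txt                  # prints the last 10 lines: 191 to 200
```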
#get help
man # man cat will show you all the flags (options) for the cat command
info # an alternative, often longer, manual
# control + c to interrupt a running command
rm # dangerous: removed files are permanently gone
rm -fr # remove a directory and everything in it, without asking
mv # move or rename files
mv mydata.txt my_new_data.txt # rename mydata.txt to my_new_data.txt
cp # copy files. do not put spaces in your file names in linux, just remember it.
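# A hedged sketch of copying, renaming and removing, done in a scratch
# directory because rm cannot be undone. Renaming uses mv, the standard
# command for it; backup.txt and old_results are made-up names.

```shell
workdir=$(mktemp -d)                    # scratch directory, nothing real lives here
cd "$workdir"
printf 'line1\nline2\n' > mydata.txt    # create a small two-line file
cp mydata.txt backup.txt                # copy it
mv mydata.txt my_new_data.txt           # rename it; mydata.txt no longer exists
mkdir old_results                       # hypothetical directory we no longer want
rm -r old_results                       # remove the directory and its contents
ls                                      # lists backup.txt and my_new_data.txt
```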
#############################
# text manipulation: again, use man to explore the flags for all the commands below, or google it!
sed # a whole book can be dedicated to explaining its usage http://www.thegeekstuff.com/sed-awk-101-hacks-ebook/
awk # a whole book can be dedicated to explaining its usage http://www.thegeekstuff.com/sed-awk-101-hacks-ebook/
cut # one of the commands I use most often: cut columns out of a text file
cut -f1,3 mydata.txt # cut out the first and third (tab-separated) columns and print them to the screen
paste # paste two files side by side
comm # compare two files line by line to find common lines; the files need to be sorted first
comm -12 # suppress the lines unique to each file, leaving only the common lines
diff # show the lines that differ between two files
wc # count lines, words and bytes
wc -l # count lines
grep ATCG mydata.txt | head # find lines containing ATCG in mydata.txt and look at the first few with head
grep -v ATCG mydata.txt > no_ATCG.txt # find lines not containing ATCG and redirect them to a new file named no_ATCG.txt
nl # number the lines of your output
sort -k2,2 -nr # reverse numerical sort based on the second column
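# A worked sketch of cut, grep, sort and comm on tiny made-up files. The
# three-column table (name, score, sequence) and the fruit lists are
# assumptions for illustration, not data from the course.

```shell
workdir=$(mktemp -d)                    # scratch directory
cd "$workdir"
# hypothetical tab-separated table: name, score, sequence
printf 'geneA\t5\tATCGGA\ngeneB\t12\tGGGTTT\ngeneC\t9\tTTATCG\n' > mydata.txt
cut -f1,3 mydata.txt                    # keep only columns 1 and 3
grep -v ATCG mydata.txt > no_ATCG.txt   # lines without ATCG: only geneB
sort -k2,2 -nr mydata.txt > sorted.txt  # highest score first: geneB, geneC, geneA
wc -l no_ATCG.txt                       # counts the lines in no_ATCG.txt
# comm needs sorted input; these two lists are already in sorted order
printf 'apple\nbanana\ncherry\n' > a.txt
printf 'banana\ncherry\ndate\n' > b.txt
comm -12 a.txt b.txt > common.txt       # lines present in both files
```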
###############################
#up-arrow to reuse your previous command
control + r #reverse search your command history
history # your command history
# redirect with >: instead of printing the output to the screen, write it to a new file
head -n 1000 mydata.txt > newdata.txt
# write the first 1000 lines to a new file named newdata.txt
ls . > file_names.txt
# write all the file names in the current directory to a file named file_names.txt
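# A short sketch of redirection on generated data, assuming seq is available.
# It also contrasts > (overwrite) with >> (append); log.txt is a made-up file
# name used only to show the difference.

```shell
workdir=$(mktemp -d)                    # scratch directory
cd "$workdir"
seq 1 2000 > mydata.txt                 # 2000 lines of made-up data
head -n 1000 mydata.txt > newdata.txt   # first 1000 lines go to a file, not the screen
echo "first" > log.txt                  # creates log.txt
echo "second" > log.txt                 # > overwrites: "first" is gone
echo "third" >> log.txt                 # >> appends: log.txt now has two lines
cat log.txt
```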
##########################
# pipes | where the power comes from
history | less # look at your history with less, type q to quit less
cat mydata.txt | sort -k2,2 -nr > whole_sorted_data.txt # I am a fan of the useless use of cat
# the line above is equivalent to
sort -k2,2 -nr mydata.txt > whole_sorted_data.txt
head -n 1000 mydata.txt | sort -k2,2 -nr > my_selected_sorted_data.txt
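# A small sketch showing that piping cat into sort gives the same result as
# handing the file to sort directly, using a made-up two-column table. The
# names via_cat.txt and direct.txt are hypothetical.

```shell
workdir=$(mktemp -d)                    # scratch directory
cd "$workdir"
printf 'a\t3\nb\t10\nc\t7\nd\t1\n' > mydata.txt    # hypothetical name/score table
cat mydata.txt | sort -k2,2 -nr > via_cat.txt      # sort reads from the pipe
sort -k2,2 -nr mydata.txt > direct.txt             # sort reads the file itself
diff via_cat.txt direct.txt && echo "identical"    # diff prints nothing when files match
# sort only the first two lines: the pipe hands head's output to sort
head -n 2 mydata.txt | sort -k2,2 -nr > my_selected_sorted_data.txt
cat my_selected_sorted_data.txt                    # b (10) comes before a (3)
```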
###########################
# use wildcard characters * and ? (the dot . is a regular-expression character, not a shell wildcard)
# again regular expressions! http://regexone.com/
##################### loops
for file in *.xml; do head -3 "$file"; done
for file in *.xml; do head -3 "$file" >> all_head3_xml.txt; done
for file in *.xml; do head -3 "${file}" > "${file}.txt"; done # write the first 3 lines of each *.xml file to a new file with the same name plus a .txt suffix
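# A hedged sketch of the loops above on two made-up XML files. Quoting "$file"
# keeps the loop working if a name contains spaces, and ${file%.xml} is the
# shell's way to strip the .xml suffix. The sample file names are hypothetical.

```shell
workdir=$(mktemp -d)                    # scratch directory
cd "$workdir"
# create two small four-line XML files to loop over
for i in 1 2; do
    printf '<a>\n<b>%s</b>\n</a>\n<extra/>\n' "$i" > "sample$i.xml"
done
# keep the old name and add a suffix: sample1.xml -> sample1.xml.txt
for file in *.xml; do
    head -3 "$file" > "${file}.txt"
done
# variant: replace the suffix instead: sample1.xml -> sample1.txt
for file in *.xml; do
    head -3 "$file" > "${file%.xml}.txt"
done
ls
```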

linux basics by Tommy Tang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.