Creative Commons License
This blog by Tommy Tang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

My GitHub page

Wednesday, August 6, 2014

linux commands basics

I am attending the NGS course at MSU. This is a great course with great instructors and friendly colleagues.
I highly recommend this course to everyone. http://bioinformatics.msu.edu/ngs-summer-course-2014

This morning, we learned SNP calling with samtools and the SAM file specification (I will write another post about SNP calling). In the evening, TA Elijah gave an awesome introduction to Linux commands.
Personally, I think this should be taught on the first day of the course. (I am already pretty familiar with basic Linux commands, but they do cause a lot of frustration for beginners.)

I took notes and put the commands that were taught in a gist. See below, and enjoy Linux commands!

#linux commands basics
#http://software-carpentry.org/v5/novice/shell/index.html
# practise, practise, practise, google, google, google and you will get it :)
pwd # print working directory
cd # change directory
sudo # super user privilege
chmod 775 # change file permissions (775 means rwxrwxr-x) http://en.wikipedia.org/wiki/Chmod
git clone # version control! get to know git and github! http://git-scm.com/
sudo bash # bad habit
whoami # who you are
exit
logout
echo # it's like print
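ls -sh # show file sizes in human-readable units (K, M, G)
ls -F # mark entry types: / after directories, * after executables (for colors, try ls --color=auto on GNU/Linux)
ls -l # long format
ls -a # list everything, including hidden files (names starting with .)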
ls /home/tommy/data # list files in the data folder by specifying a full path
clear # clear the screen
# remember the tab magic auto-fill
# clarify the filesystem: slash / denotes the root directory ~ denotes the home directory
cd - # change directory to your previous directory
cd ~ # change directory to your home directory (plain cd does the same)
cd .. # change directory one level up
cd ../.. # change directory to two levels up
cp /home/tommy/data.txt . # copy data.txt to the current directory, which is denoted by .
# create things: text editors!
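# a small sketch of the navigation commands above, assuming a throwaway directory made with mktemp;
# the folder names (project, data) are made up for illustration

```shell
# Work inside a temporary directory so nothing real is touched
top=$(mktemp -d)
mkdir -p "$top/project/data"

cd "$top/project/data"   # go down two levels using a full path
pwd                      # print where we are now
cd ../..                 # two levels up: back in the temporary directory
here=$(pwd)
cd -                     # jump back to the previous directory (cd - also prints it)
back=$(pwd)
```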
nano # use this one, because it is easy
emacs # do not use it until later stage
vi # 90% of the people do not know how to exit vim :) (press Esc, then type :q! and Enter to quit without saving)
cat mydata.txt # print the content of mydata.txt to standard output (the screen)
cat mydata1.txt mydata2.txt > total_data.txt # concatenate two files into a new file
ctrl + D # end of input: tells cat this was the last line when you do something like: cat > mydata.txt
head -15 # print the first 15 lines to the screen
tail # print the last 10 lines to the screen by default
less # use less to peek txt files. less is more
#get help
man # man cat will show you all the flags (arguments) for cat command
info
# control + c to quit any command
rm # dangerous: removes files (permanently gone, there is no trash can)
rm -fr # remove a directory and everything inside it, without asking
mv # move or rename files
mv mydata.txt my_new_data.txt # rename mydata.txt to my_new_data.txt
cp # copy files. do not put spaces in your file names in linux, just remember it.
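# a hedged sketch of cp, mv and rm working together, run in a temporary directory;
# the file and folder names (backup.txt, old_results) are made up for illustration

```shell
work=$(mktemp -d)
cd "$work"

echo "sample line" > mydata.txt    # create a small file to play with
cp mydata.txt backup.txt           # copy: both files now exist
mv mydata.txt my_new_data.txt      # rename: mydata.txt no longer exists
mkdir old_results
mv backup.txt old_results/         # mv also moves a file into a directory
rm -r old_results                  # remove the directory and the copy inside it
ls                                 # only my_new_data.txt is left
```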
#############################
# text manipulation again use man to explore the flags for all the commands below or google it!
sed # a book can be dedicated to explain the usage of it http://www.thegeekstuff.com/sed-awk-101-hacks-ebook/
awk # a book can be dedicated to explain the usage of it http://www.thegeekstuff.com/sed-awk-101-hacks-ebook/
cut # one of the commands I use most often; cuts columns out of a txt file (tab-delimited by default)
cut -f1,3 mydata.txt # cut out the first and third columns and print them to the screen
paste # paste two files side by side
comm # find the lines two files have in common; the files need to be sorted first
comm -12 # suppress the lines unique to each file, leaving only the lines common to both
diff # show the lines that differ between two files
wc # count lines, words and bytes
wc -l # count lines only
grep ATCG mydata.txt | head # find lines containing ATCG in mydata.txt and show the first 10 with head
grep -v ATCG mydata.txt > no_ATCG.txt # find lines not containing ATCG and redirect to a new file named no_ATCG.txt
nl # number lines of your output
sort -k2,2 -nr # reverse numerical sort based on my second column
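# a minimal sketch of cut, sort, grep and comm on a tiny tab-separated table;
# the gene names, counts and file names below are invented sample data, not from the course

```shell
work=$(mktemp -d)
cd "$work"

# Three columns separated by tabs: gene, count, chromosome
printf 'geneA\t12\tchr1\ngeneB\t7\tchr2\ngeneC\t30\tchr1\n' > mydata.txt

cut -f1,3 mydata.txt                    # keep only the gene and chromosome columns
sort -k2,2 -nr mydata.txt               # largest count first
grep -v chr1 mydata.txt > no_chr1.txt   # lines that do not mention chr1
top_gene=$(sort -k2,2 -nr mydata.txt | head -n 1 | cut -f1)

# comm needs sorted input; these two lists are already in order
printf 'a\nb\nc\n' > list1.txt
printf 'b\nc\nd\n' > list2.txt
comm -12 list1.txt list2.txt            # lines present in both files
```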
###############################
#up-arrow to reuse your previous command
control + r #reverse search your command
history # your command history
# redirect with >: instead of printing the output to the screen, write it to a new file (an existing file is overwritten)
head -n 1000 mydata.txt > newdata.txt
# redirect the first 1000 lines to a new file named newdata.txt
ls . > file_names.txt
# redirect all the file names in the current directory to a file named file_names.txt
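# a short sketch of the difference between > (overwrite) and >> (append), in a temporary directory;
# log.txt is a made-up file name

```shell
work=$(mktemp -d)
cd "$work"

echo "first"  > log.txt    # > creates log.txt
echo "second" > log.txt    # > again: the old content is overwritten
echo "third" >> log.txt    # >> appends to the end instead
cat log.txt                # shows the two remaining lines
```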
##########################
# pipes | where the power comes from
history | less # page through your history with less, type q to quit less
cat mydata.txt | sort -k2,2 -nr > whole_sorted_data.txt # I am a useless cat fan
# the command above is equivalent to
sort -k2,2 -nr mydata.txt > whole_sorted_data.txt
head -n 1000 mydata.txt | sort -k2,2 -nr > my_selected_sorted_data.txt
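# a hedged sketch of a longer pipeline: count how often each value occurs and show the most frequent one.
# uniq -c is not in the notes above (it counts adjacent duplicate lines, so the input is sorted first);
# chroms.txt and its content are invented sample data

```shell
work=$(mktemp -d)
cd "$work"

printf 'chr1\nchr2\nchr1\nchr3\nchr1\nchr2\n' > chroms.txt

# sort groups identical lines, uniq -c counts them, sort -nr puts the biggest count first
sort chroms.txt | uniq -c | sort -k1,1 -nr | head -n 1

# the same pipeline, keeping only the name of the most frequent value
most_common=$(sort chroms.txt | uniq -c | sort -k1,1 -nr | head -n 1 | awk '{print $2}')
echo "$most_common"
```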
###########################
# use wildcard characters * and ? (the . belongs to regular expressions, where it matches any single character)
# again regular expressions! http://regexone.com/
##################### loops
for file in *.xml; do head -n 3 "$file"; done # print the first 3 lines of every xml file
for file in *.xml; do head -n 3 "$file" >> all_head3_xml.txt; done # append them all to one file
for file in *.xml; do head -n 3 "${file}" > "${file}.txt"; done # save the first 3 lines of each *.xml file to a new file that keeps the old name plus a .txt suffix
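# a minimal sketch of a loop that swaps the suffix instead of stacking it (sample.xml becomes sample.txt, not sample.xml.txt).
# ${file%.xml} strips a trailing .xml from the name; the quotes keep the loop working even if a name contains a space.
# the sample file names are made up for illustration

```shell
work=$(mktemp -d)
cd "$work"

# Create two small xml files; the names deliberately contain a space
for n in 1 2; do
    printf 'line1\nline2\nline3\nline4\n' > "sample $n.xml"
done

# Save the first 3 lines of each xml file under the same name with a .txt suffix
for file in *.xml; do
    head -n 3 "$file" > "${file%.xml}.txt"
done
ls
```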
linux basics by Tommy Tang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
