readings
links: http://yeolab.github.io/onboarding/geo.html
http://www.hildeschjerven.net/Protocols/Submission_of_HighSeq_data_to_GEO.pdf
https://www.ncbi.nlm.nih.gov/geo/info/submissionftp.html
http://www.hildeschjerven.net/Protocols/Submission_of_HighSeq_data_to_GEO.pdf
https://www.ncbi.nlm.nih.gov/geo/info/submissionftp.html
1. create account
Go to NCBI GEO: http://www.ncbi.nlm.nih.gov/geo/ Create User ID and password. my username is
research_guru
I used my google account.
2. fill in the xls sheet
Downloaded the meta xls sheet from https://www.ncbi.nlm.nih.gov/geo/info/seq.html
## bgzip the fastqs
cd 01seq
find *fastq | parallel bgzip
md5sum *fastq.gz > fastq_md5.txt
# copy to excle
cat fastq_md5.txt | awk '{print $2}'
#copy to excle
cat fastq_md5.txt | awk '{print $1}'
cd ../07bigwig
#get the md5sum
md5sum *bw > bigwig_md5.txt
# sample name, copy to excel
cat bigwig_md5.txt | awk '{print $2}'
# md5, copy to excel
cat bigwig_md5.txt | awk '{print $1}'
cd ../08peak_macs1
md5sum *macs1_peaks.bed > peaks_md5.txt
# copy to excel
cat peaks_md5.txt | awk '{print $2}'
cd ..
mkdir research_guru_KMT2D_ChIPseq
cd research_guru_KMT2D_ChIPseq
## fill in the xls sheet and save in this folder
This is the most time-consuming/tedious step.
3. hard link peak and bigwig files to the folder
soft link does not work for me…
ln /rsrch2/genomic_med/krai/hunain_histone_reseq/snakemake_ChIPseq_pipeline_downsample/07bigwig/*bw .
ln /rsrch2/genomic_med/krai/hunain_histone_reseq/snakemake_ChIPseq_pipeline_downsample/08peak_macs1/*macs1_peaks.bed .
4. upload to GEO
# inside the folder: research_guru_KMT2D_ChIPseq
ftp ftp-private.ncbi.nlm.nih.gov
## type in the user name and the password
## this is not your GEO account user name.
## everyone uses the same `geo` and the same password below.
# https://www.ncbi.nlm.nih.gov/geo/info/submissionftp.html
#name: geo
#password: 33%9uyj_fCh?M16H
ftp> prompt n
Interactive mode off.
ftp> cd fasp
# make a folder in the ftp site
ftp> mkdir research_guru_ChIPseq
ftp> cd research_guru_ChIPseq
#upload all the files
ftp> mput *
5. telling NCBI you uploaded stuff
After your transfer is complete, you need to tell the NCBI.
After file transfer is complete, please e-mail GEO with the following information: - GEO account username (tangming2005@gmail.com); - Names of the directory and files deposited; - Public release date (required - up to 3 years from now - see FAQ).
Side notes
for paired-end sequencing data. the xls sheet requires you to fill in the average insert size and the std.
picard CollectInsertSizeMetrics can do this job.time java -jar /scratch/genomic_med/apps/picard/picard-tools-2.13.2/picard.jar CollectInsertSizeMetrics I=4-Mll4-RasG12D-1646-2-cd45_S40_L006.sorted.bam H=4-Mll4-RasG12D-1646-2-cd45_S40_L006_insert.pdf O=4-Mll4-RasG12D-1646-2-cd45_S40_L006_insert.txt
# finish in ~5mins
read http://thegenomefactory.blogspot.com/2013/08/paired-end-read-confusion-library.html for insert size definition.

ReplyDeleteบริการเกมสล็อตออนไลน์ปี 2022 เกมให้เลือกเล่นมากกว่า 1,000 เกมซุปเปอร์สล็อตเล่นผ่านเว็บ f
บริการเกมสล็อตออนไลน์ปี 2022 เกมให้เลือกเล่นมากกว่า 1,000 เกมสล็อตออนไลน์ d
ReplyDelete
ReplyDeleteบริการเกมสล็อตออนไลน์ปี 2022 เกมให้เลือกเล่นมากกว่า 1,000 เกมสล็อตออนไลน์เว็บใหญ่ที่สุด j
Total deposit 10 get 100 wallets latest money making games on mobile 2022 online slots pgslot
ReplyDeleteI truly appreciate the effort you put into creating this detailed and helpful post, it was great to read. Read this article to learn more Coreball Game. The Coreball Game challenges your timing and precision as each level increases in difficulty.
ReplyDelete