After installing Entrez Direct, I played around with it:
# search PubMed for records containing "glioblastoma enhancer"
$ esearch -db pubmed -query "glioblastoma enhancer"
<ENTREZ_DIRECT>
  <Db>pubmed</Db>
  <WebEnv>NCID_1_539964707_130.14.18.34_9001_1422280320_2091337226_0MetA0_S_MegaStore_F_1</WebEnv>
  <QueryKey>1</QueryKey>
  <Count>97</Count>
  <Step>1</Step>
</ENTREZ_DIRECT>
# restricting the search to titles with "glioblastoma enhancer [TITL]" returned a count of 0
$ esearch -db pubmed -query "glioblastoma enhancer [TITL]"
<ENTREZ_DIRECT>
  <Db>pubmed</Db>
  <WebEnv>NCID_1_23683635_130.14.22.215_9001_1422280849_1465220088_0MetA0_S_MegaStore_F_1</WebEnv>
  <QueryKey>1</QueryKey>
  <Count>0</Count>
  <Step>1</Step>
</ENTREZ_DIRECT>
# fetch the abstracts
$ esearch -db pubmed -query "glioblastoma enhancer" | efetch -format abstract > glioblastoma.txt
# check the abstracts
$ less -S glioblastoma.txt
# how many papers?
$ cat glioblastoma.txt | grep PMID | wc -l
97
# fetch the protein sequences of human CTCF
$ esearch -db protein -query "Homo sapiens [ORGN] AND CTCF[GENE]" | efetch -format fasta > CTCF_protein.fa
# fetch the nucleotide sequences of human CTCF
$ esearch -db nucleotide -query "Homo sapiens [ORGN] AND CTCF[GENE]" | efetch -format fasta > CTCF_nucleotide.fa
# the same records in GenBank format
$ esearch -db nucleotide -query "Homo sapiens [ORGN] AND CTCF[GENE]" | efetch -format gb > CTCF_nucleotide.gb
# From a Biostars post: https://www.biostars.org/p/92671/
# Given a Gene ID, download the amino acid sequences of the corresponding proteins, keeping only the reviewed entries (e.g. no putative or predicted sequences):
$ esearch -db gene -query "1234[id]" | elink -target protein | efilter -query "REVIEWED[FILTER]" | efetch -format fasta
# Given a file containing a list of Gene IDs (one per line), download all the entries in tabular format:
$ esearch -db gene -query "$(paste -s -d ',' mygenes.ids)" | efetch -format tabular > mygenes.details.txt
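The last command relies on paste to turn a one-ID-per-line file into a single comma-separated string. Here is a minimal sketch for checking that step on its own, without contacting NCBI. The three Gene IDs and the temporary file are made up for illustration, and the final line only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical Gene IDs, for illustration only; replace with your own list.
ids_file=$(mktemp)
printf '%s\n' 1234 5678 672 > "$ids_file"

# paste -s joins all lines of the file into one line; -d ',' sets the separator.
id_list=$(paste -s -d ',' "$ids_file")
echo "$id_list"    # prints 1234,5678,672

# Dry run: show the pipeline that would be executed, without running it.
echo "esearch -db gene -query \"$id_list\" | efetch -format tabular"

rm -f "$ids_file"
```

Quoting the command substitution keeps the whole list as one argument to -query, even if the file ever contains spaces.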
Commonly-used fields for PubMed queries include:
[AFFL]  Affiliation           [FILT]  Filter              [MESH]  MeSH Terms
[ALL]   All Fields            [JOUR]  Journal             [PTYP]  Publication Type
[AUTH]  Author                [LANG]  Language            [WORD]  Text Word
[FAUT]  Author - First        [MAJR]  MeSH Major Topic    [TITL]  Title
[LAUT]  Author - Last         [SUBH]  MeSH Subheading     [TIAB]  Title/Abstract
[PDAT]  Date - Publication    [UID]   UID
Filters that limit search results to subsets of PubMed include:
humans [MESH]                      has abstract [FILT]                   pharmacokinetics [MESH]
historical article [FILT]          chemically induced [SUBH]             loprovflybase [FILT]
all child [FILT]                   randomized controlled trial [FILT]    english [FILT]
clinical trial, phase ii [PTYP]    free full text [FILT]                 review [PTYP]
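One way to use these filters is to append them to a search term with AND. The sketch below only builds and prints the query string, so it runs without network access. The choice of topic and of these three filters is my own example, not something from the manual.

```shell
#!/bin/sh
# Start from a plain search term (the one used earlier in this post).
topic="glioblastoma enhancer"
query="$topic"

# Append each filter with AND; the filters are taken from the list above.
for filter in "review [PTYP]" "english [FILT]" "has abstract [FILT]"; do
    query="$query AND $filter"
done

echo "$query"
# To run it for real (requires EDirect and network access):
#   esearch -db pubmed -query "$query"
```

Keeping the filters in a loop makes it easy to add or drop one and compare the resulting counts.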
Sequence databases are indexed with a different set of search fields, including:
[ACCN]  Accession       [GENE]  Gene Name            [PROT]  Protein Name
[ALL]   All Fields      [JOUR]  Journal              [SQID]  SeqID String
[AUTH]  Author          [KYWD]  Keyword              [SLEN]  Sequence Length
[GPRJ]  BioProject      [MLWT]  Molecular Weight     [SUBS]  Substance Name
[ECNO]  EC/RN Number    [ORGN]  Organism             [WORD]  Text Word
[FKEY]  Feature Key     [PACC]  Primary Accession    [TITL]  Title
[FILT]  Filter          [PROP]  Properties           [UID]   UID
and a sample query in the protein database is:
"alcohol dehydrogenase [PROT] NOT (bacteria [ORGN] OR fungi [ORGN])"
Please refer to the documentation for more examples: http://www.ncbi.nlm.nih.gov/books/NBK179288/
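The sample protein query above can also be assembled from a list of organisms to exclude, which helps when the list grows. This is only a sketch of the string building. The variable names are my own, and the esearch call is left as a comment so the snippet runs offline.

```shell
#!/bin/sh
protein="alcohol dehydrogenase"

# Join the excluded organisms with OR, tagging each with [ORGN].
excluded=""
for organism in bacteria fungi; do
    if [ -z "$excluded" ]; then
        excluded="$organism [ORGN]"
    else
        excluded="$excluded OR $organism [ORGN]"
    fi
done

query="$protein [PROT] NOT ($excluded)"
echo "$query"
# To run it for real (requires EDirect and network access):
#   esearch -db protein -query "$query" | efetch -format fasta
```

With the two organisms shown, the printed string is the same as the sample query from the manual.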
Hello, I am new to using Entrez Direct myself. Is there a filter/restriction I can set, to find all documents listed that have been published in the last 5 days?
I am also new to it. According to the manual:
Results can also be filtered by time. For example, the following statements:
efilter -days 60 -datetype PDAT
efilter -mindate 1990 -maxdate 1999 -datetype PDAT
restrict results to articles published in the previous two months or in the 1990s, respectively.
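For the five-day case asked about above, the same -days flag should apply with a smaller number. Here is a small dry-run sketch that prints the pipeline instead of running it, so it can be tried without network access. The function name recent_pipeline is made up for this example, and the search term is the one from the post.

```shell
#!/bin/sh
# Print (but do not run) an EDirect pipeline limited to the last N days
# by publication date. recent_pipeline is a hypothetical helper name.
recent_pipeline() {
    query=$1
    days=$2
    printf 'esearch -db pubmed -query "%s" | efilter -days %s -datetype PDAT\n' "$query" "$days"
}

pipeline=$(recent_pipeline "glioblastoma enhancer" 5)
echo "$pipeline"
```

Copying the printed line into a terminal with EDirect installed runs the actual search.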
Thanks, I realized that about 8 hours ago xD I had overlooked it, and nobody on the web seems to have written about it.
No problem! Good luck with your research.