Exercises in Data Retrieval and Using BLAST Searches

Developed by Susan Jean Johns, UCSF Department of Pharmaceutical Chemistry

[The NCBI web site and its parts are updated periodically; therefore,
the results given below may change with time.]

Step by step instructions with screen shots are given at the end of each exercise.
Instructions not in the step by step section of the handout are given in green type.

Screen shots were taken from either Safari or Firefox, depending upon which produced the clearer, more easily read image.


Table of Contents:


#1 What is the scope of information available at NCBI on cystic fibrosis in humans?
  Hints: Do an All Databases search at NCBI (http://www.ncbi.nlm.nih.gov). Then repeat the search, narrowing the returned hits to human.

  Answer: Lots of data to be explored.

#1 step by step instructions


#2 Besides the cystic fibrosis transmembrane conductance regulator gene (CFTR), what other genes are associated with cystic fibrosis in humans and what is their relationship to the disease?
  Hints: Perform an Entrez Gene search (http://www.ncbi.nlm.nih.gov) to find the other genes and their function or relationship to the disease.

#2 step by step instructions
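
For readers who prefer to script the search rather than use the web form, the query in the hint can also be expressed as a URL for NCBI's E-utilities ESearch service. This is a minimal sketch and not part of the original handout: the function name build_esearch_url, the retmax value, and the exact query wording are assumptions chosen for illustration. The code only builds the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, organism=None, retmax=20):
    """Return an ESearch URL for the given database and search term.

    build_esearch_url is a hypothetical helper written for this handout.
    No network request is made here.
    """
    if organism:
        # Restrict the hits to one species with the [Organism] field tag.
        term = f"({term}) AND {organism}[Organism]"
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Search the Gene database for cystic fibrosis, limited to human.
url = build_esearch_url("gene", "cystic fibrosis", organism="Homo sapiens")
print(url)
```

Opening the printed URL in a browser should return an XML list of matching Gene IDs. As with the web searches, the hits will change as NCBI is updated.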


#3 What gene is associated with nocturnal asthma in humans? What are the RefSeq codes for this gene's mRNA and protein sequences? On the GenBank accession pages, data can be displayed in different formats. What is the difference between the default and FASTA formats for these sequence files? How can these RefSeq codes be used to search for similar sequences in other species? What are the results of such a search?
  Hints: Do a Gene search at NCBI (http://www.ncbi.nlm.nih.gov) and record the codes. Compare the formats of the mRNA and protein sequences. Run a BLAST search (http://blast.ncbi.nlm.nih.gov).

#3 step by step instructions
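
As an aid to the format comparison in this exercise: a FASTA record is a single header line beginning with ">" followed by lines of sequence, whereas the default GenBank display also carries annotation such as features and references. The sketch below reads FASTA text into (header, sequence) pairs. The function name parse_fasta and the example record are hypothetical, invented for illustration; the sequence is not a real NCBI entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined into one continuous string.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, made up for this example (not a real NCBI entry).
EXAMPLE = """>example_id hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(EXAMPLE)
for header, sequence in records:
    print(header, len(sequence))
```

The header text after ">" is where the accession code appears in a real RefSeq FASTA file, which makes it easy to copy into a BLAST query box.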


#4 Are there any solved protein crystal structure(s) for the nocturnal asthma gene? Does the structure include the transmembrane segments? Are the protein in the structure found and the nocturnal asthma protein related closely enough to trust the results?

  Hints: Use the protein accession code from the previous exercise and run a protein BLAST search (http://blast.ncbi.nlm.nih.gov). This time, instead of using the default database, use the Swiss-Prot database and a structure database. Compare the available structure information to make the decision.

#4 step by step instructions


#5 Find proteins that are known to contribute to pulmonary artery hypertension and determine if animal models exist in which the disease can be studied. Can a full length dog protein sequence be found?

  Answer: The best dog match is NP_001006646 (600 aa).

#5 step by step instructions


#6 Are there knockout mice available to study the AGPAT6 gene? How would you order one of these cell lines?

  Hints: Find the mRNA FASTA formatted sequence for the AGPAT6 mouse gene by doing an Entrez Gene search at NCBI (http://www.ncbi.nlm.nih.gov). Then run a BLAST search at the International Gene Trap Consortium (IGTC) site (http://www.genetrap.org) to see if such knockouts exist.


#6 step by step instructions


last updated 5/30/2008