BMI-203 --- Journal References
A PDF copy of Robert May's article in Science about the uses of mathematics in biology can be found
here.
Listed below are journal references that may provide useful background
reading. Many of these references come from a list compiled by Russ Altman
for his MIS-214
course at Stanford. A few of these references are available on-line
(see links below); others are in journals available at the UCSF library,
and some are from more obscure publications and hence may be difficult to
obtain. If you have trouble finding an article you are interested in,
please speak to one of the instructors.
- Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. Basic Local
Alignment Search Tool. J. Mol. Biol., 215, pp. 403-410, 1990.
The paper introducing the BLAST algorithm.
- Arun, K.S., Huang, T.S., and Blostein, S.D. Least-Squares Fitting of Two 3-D
Point Sets. IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. PAMI-9, No. 5, pp. 698-700, 1987.
A nice paper introducing a method for computing optimal alignment of
two sets of points, given their correspondences.
- Bork, P., Dandekar, T., Diaz-Lazcoz, Y., Eisenhaber, F., Huynen, M., and Yuan,
Y. Predicting Function: From Genes to Genomes and Back. J. Mol. Biol.,
283, pp. 707-725, 1998.
An excellent review of the problem of predicting the function of genes
in the genome age.
- Bowie, J.U., Luthy, R., and Eisenberg, D. A Method to Identify Protein Sequences
that Fold into a Known Three-Dimensional Structure. Science, 253, pp. 164-170,
1991.
A classic paper introducing the idea of fold recognition, using
dynamic programming and an environmental description of structure.
- Bryant, S.H. and Altschul, S.F. Statistics of Sequence-Structure Threading.
Current Opinion in Structural Biology, 5, pp. 236-244, 1995.
A good review about how to evaluate protein threading performance.
- Burge, C. and Karlin, S. Prediction of Complete Gene Structures in Human Genomic
DNA. J. Mol. Biol., 268, pp. 78-94, 1997.
An excellent method for finding genes using probabilistic models.
- Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. A Model of Evolutionary Change
in Proteins. Atlas of Protein Sequence and Structure, v. 5 suppl. 3,
pp. 345-352, 1978.
A classic paper introducing the PAM matrix.
- Doolittle, R.F., Feng, D.-F., Tsang, S., Cho, G., and Little, E. Determining Divergence
Times of the Major Kingdoms of Living Organisms With a Protein Clock. Science,
271, pp. 470-477, 1996.
A classic paper discussing how to correlate divergence in sequence
with chronological time.
- Friedman, R. Lesson 5: Sequence Comparison and Alignment.
Columbia University, 2000.
Lecture notes from Dr. Richard Friedman's class on sequence analysis.
These notes may prove useful as supplementary material for the lecture on
dynamic programming. The notes are available
here.
- Gerstein, M. and Levitt, M. Using Iterative Dynamic Programming to Obtain Accurate
Pairwise and Multiple Alignments of Protein Structures. Proc. of ISMB-96,
pp. 59-67, 1996.
A nice paper introducing a method, based on dynamic programming, for
aligning two 3D structures and finding correspondences between points.
- Gotoh, O. An Improved Algorithm for Matching Biological Sequences. J. Mol.
Biol., 162, pp. 705-708, 1982.
A follow-on paper to Needleman-Wunsch that shows how sequence alignment
with affine gap penalties can be done in O(N^2) time.
- Gribskov, M., McLachlan, A.D., and Eisenberg, D. Profile Analysis: Detection
of Distantly Related Proteins. Proc. Natl. Acad. Sci. USA, 84, pp. 4355-4358,
1987.
A classic exposition of the PROFILE method for characterizing protein
sequence multiple alignments.
- Gribskov, M. and Devereux, J., eds. Sequence Analysis Primer, pp. 124-137,
1994.
A very useful introduction to dynamic programming sequence alignment.
- Henikoff, S. and Henikoff, J.G. Amino Acid Substitution Matrices from Protein
Blocks. Proc. Natl. Acad. Sci. USA, 89, pp. 10915-10919, 1992.
Introduction of the BLOSUM matrix from the BLOCKS database.
- Karlin, S. and Altschul, S.F. Methods for Assessing the Statistical Significance
of Molecular Sequence Features by Using General Scoring Schemes. Proc.
Natl. Acad. Sci. USA, 87, pp. 2264-2268, 1990.
An exposition of the statistical theory behind the BLAST significance
scores.
- Karp, P.D. and Riley, M. EcoCyc: The Resource and the Lessons Learned. SRI
Report, January 21, 1999.
An introduction to the EcoCyc knowledge base of E. coli metabolism.
- Koza, J.R. Evolution of a Computer Program for Classifying Protein Segments
as Transmembrane Domains Using Genetic Programming. Proc. of ISMB-94,
pp. 244-252, 1994.
An excellent example of the use of Genetic Programming to solve
a biological problem.
- Krogh, A., Brown, M., Mian, I.S., Sjolander, K., and Haussler, D. Hidden Markov
Models in Computational Biology. J. Mol. Biol., 235, pp. 1501-1531, 1994.
A landmark paper showing the use of Hidden Markov Models as applied
to the globin family.
- Lathrop, R.H. The Protein Threading Problem With Sequence Amino Acid Interaction
Preferences is NP-Complete. Protein Engineering, 7:9, pp. 1059-1068,
1994.
A useful paper showing that protein threading is NP-complete.
- Lawrence, C.E., Altschul, S.F., Boguski, M.S., Liu, J.S., Neuwald, A.F., and Wootton,
J.C. Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple
Alignment. Science, 262, pp. 208-214, 1993.
A classic paper showing the applicability of Gibbs sampling methods
to multiple sequence alignment.
- McClure, M.A., Vasi, T.K., and Fitch, W.M. Comparative Analysis of Multiple
Protein-Sequence Alignment Methods. Mol. Biol. Evol., 11:4, pp. 571-592,
1994.
A very useful review and comparison of different multiple alignment
strategies, tested on some hard problems.
- Needleman, S.B. and Wunsch, C.D. A General Method Applicable to the Search
for Similarities in Amino Acid Sequence of Two Proteins. J. Mol. Biol.,
48, pp. 443-453, 1970.
The classic introduction of dynamic programming for sequence alignment.
- Nussinov, R. and Wolfson, H.J. Efficient Detection of Three-dimensional Structural
Motifs in Biological Macromolecules by Computer Vision Techniques. Proc.
Natl. Acad. Sci. USA, 88, pp. 10495-10499, 1991.
A paper introducing computer vision techniques for processing 3D structure.
- Orengo, C.A. and Taylor, W.R. SSAP: Sequential Structure Alignment Program for
Protein Structure Comparison. Methods in Enzymology, 266, pp. 617-635,
1996.
A nice description of a method for comparing 3D structures and identifying
correspondences.
- Prevelige, P. and Fasman, G.D. Chou-Fasman Prediction of the Secondary Structure
of Proteins: The Chou-Fasman-Prevelige Algorithm. In Prediction of Protein
Structure and the Principles of Protein Conformation (G.D. Fasman, ed.), Plenum
Publishing, N.Y., pp. 391-416, 1989.
A description of the classic secondary structure prediction algorithm
by Chou and Fasman.
- Richards, F.M. Calculation of Molecular Volumes and Areas for Structures of
Known Geometry. Methods in Enzymology, 115, pp. 440-464, 1985.
A classic paper on how to compute important 3D properties of volume
and surface for macromolecules.
- Richards, F.M. The Protein Folding Problem. Scientific American, pp.
54-63, January 1991.
A readable summary of the protein folding problem.
- Sippl, M.J. Knowledge-based Potentials for Proteins. Current Opinion in
Structural Biology, 5, pp. 229-235, 1995.
A good description of the use of distance-based potentials for protein
threading.
- Smith, T.F. The History of the Genetic Sequence Databases. Genomics,
6, pp. 701-707, 1990.
A very useful, informal history of sequence databases, offering a glimpse
into the high-stakes world of biological sequence analysis.
- Smith, T.F. and Waterman, M.S. Identification of Common Molecular Subsequences.
J. Mol. Biol., 147, pp. 195-197, 1981.
The classic follow-on to the Needleman & Wunsch paper describing
the use of a similar algorithm for local alignment.
- Subbiah, S., Laurents, D.V., and Levitt, M. Structural Similarity of DNA-binding
Domains of Bacteriophage Repressors and the Globin Core. Current Biology, 3:3,
pp. 141-148, 1993.
A short paper describing a method for detecting 3D protein similarities,
and some unexpected discoveries when this method is applied to the globins.
- Thorne, J.L., Kishino, H., and Felsenstein, J. An Evolutionary Model for Maximum
Likelihood Alignment of DNA Sequences. J. Mol. Evol., 33, pp. 114-124,
1991.
A classic paper on using a model for evolution to compute multiple
sequence alignments.
- Unger, R. and Moult, J. Genetic Algorithms for Protein Folding Simulations.
J. Mol. Biol., 231, pp. 75-81, 1993.
One of the first applications of genetic algorithms to problems in
molecular biology.
- Zuker, M. On Finding All Suboptimal Foldings of an RNA Molecule. Science,
244, pp. 48-52, 1989.
Classic paper on the folding of RNA secondary structure, based on experimental
measurements of interaction energies.