Brian Shoichet¹, John J. Irwin¹, P. Therese Lang¹, Eric Pettersen² and Elaine Meng²
¹ Department of Pharmaceutical Chemistry
University of California, San Francisco
² Resource for Biocomputing, Visualization, and Informatics
University of California, San Francisco
Transient non-covalent interactions are critical for biological processes. The sequencing of a variety of genomes and the development of proteomics techniques have enabled scientists to study these interactions on the widest scales. Advances in X-ray crystallography, nuclear magnetic resonance spectroscopy, and other experimental structure techniques provide the ability to study these interactions at an atomic level of detail. One important application of these advances is the design of small molecules that interact with cellular processes to modify biological activity and treat disease.
The drug discovery process typically requires 10 to 15 years from early discovery to FDA approval. Computational tools such as virtual screening, homology modeling, and cheminformatics are applied both to facilitate various stages of research and to create models that explain experimental data [4-6]. Molecular docking, which can broadly be defined as the prediction of the orientation of two molecules with respect to one another, is a computational technique that has been used successfully in both of these capacities. In drug design applications, one molecule is typically a protein or nucleic acid drug target (the receptor) and the other is a small molecule that will be examined as a putative drug (the ligand). Docking is used to identify novel ligands that interact with a biomolecular target and to predict the geometric position (binding mode) of ligands with respect to the target of interest.
DOCK is one of several available molecular docking packages, which also include Glide, FlexX, and GOLD [9-11]. Our motivation in developing DOCK is to provide a modular docking package that permits the easy development of new scoring functions, search algorithms, and analysis tools. Thus, each functional unit of the DOCK algorithm was implemented as a self-contained and portable module that interacts with the user through a well-defined interface. The object-oriented language C++ was chosen to allow each component of the DOCK algorithm to be implemented as a class, which encapsulates both the data structures and the functions that operate on them. DOCK 5 incorporates several new routines, including parallelization of the algorithm through an external library, modification of the ligand structural class to enable greater user control over sampling, and clustering of the final results by root mean square deviation. In DOCK 6 (released in July 2006), additional scoring functions are implemented, including one that communicates smoothly with the AMBER molecular dynamics package. Going hand in hand with the development of DOCK itself is the development of tools to facilitate the preparation of DOCK input and the examination of DOCK output. Collaboration with the RBVI has produced the Dock Prep extension to Chimera for the former need and the ViewDock extension for the latter.
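To make the idea of clustering final results by root mean square deviation (RMSD) concrete, the following is a minimal Python sketch. It is not DOCK's implementation, which is written in C++ and may use a different procedure. It assumes a simple greedy "leader" scheme, that lower scores are better, that all poses list their atoms in the same order, and a hypothetical 2.0 Å threshold; the function names `rmsd` and `cluster_poses` are illustrative only.

```python
import math


def rmsd(a, b):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples."""
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    total = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(total / len(a))


def cluster_poses(poses, threshold=2.0):
    """Greedy leader clustering of docked poses (illustrative assumption).

    poses: list of (score, coords) tuples; lower score is assumed to be better.
    Returns a list of clusters. Each cluster is a list of indices into `poses`,
    and its first index is the best-scoring member (the representative).
    """
    # Visit poses from best to worst score so each representative is the
    # best-scoring pose of its cluster.
    order = sorted(range(len(poses)), key=lambda i: poses[i][0])
    clusters = []
    for i in order:
        for cluster in clusters:
            representative = poses[cluster[0]][1]
            if rmsd(representative, poses[i][1]) <= threshold:
                cluster.append(i)
                break
        else:
            # No existing representative is close enough: start a new cluster.
            clusters.append([i])
    return clusters
```

Under these assumptions, keeping only the first index of each cluster gives one representative binding mode per group of near-identical poses.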
Researchers preparing structures as input to DOCK must typically perform several tasks, including:
- Deleting solvent molecules
- Deleting alternate locations of atoms
- Adding hydrogens
- Changing modified residues to the corresponding standard residues for which force field parameters are available (for example, changing selenomethionine to methionine)
- Repairing truncated side chains
- Properly identifying atom types in nonstandard residues such as cofactors and ligands
- Assigning partial charges to atoms
- Saving processed structures in Mol2 format
The last task, in particular, has constrained DOCK users to using the commercial program SYBYL®, which in turn has severely constrained the use of DOCK in the academic community. The motivation for developing Dock Prep was to streamline the tasks outlined above into a tool that would be freely available to academic researchers.
While deletion of unwanted atoms and addition of hydrogens could already be done separately in Chimera, their integration into the Dock Prep interface along with several new capabilities has significantly simplified structure preparation. The interface allows each task to be turned on or off depending on the situation. Dock Prep uses hydrogen-bond-guided hydrogen placement, which has been demonstrated to improve docking success rates. Truncated amino acid side chains are restored using either a backbone-dependent or backbone-independent rotamer library, as specified by the user. Several different AMBER charge sets are available for standard amino acid and nucleic acid residues. The ability to calculate charges for nonstandard residues and map their Chimera atom types to SYBYL atom types is provided by the program Antechamber, now included with Chimera.
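As a rough illustration of three of the simpler preparation tasks (deleting solvent, deleting alternate locations, and changing selenomethionine to methionine), and of switching each task on or off, here is a minimal Python sketch that works directly on fixed-column PDB records. It is not the Dock Prep implementation and does not use the Chimera API. The function name `prepare_pdb_lines` and its keyword arguments are hypothetical, and keeping only alternate location "A" is a simplifying assumption (a real tool might instead keep the highest-occupancy location).

```python
def prepare_pdb_lines(lines, delete_solvent=True, delete_altlocs=True,
                      mse_to_met=True):
    """Return cleaned PDB lines; each task can be turned on or off.

    Only ATOM and HETATM records are modified. Column positions follow the
    fixed-width PDB format (0-based slices below).
    """
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(("ATOM  ", "HETATM")):
            cleaned.append(line)
            continue

        line = line.ljust(80)       # pad so every column slice is valid
        alt_loc = line[16]          # column 17: alternate location indicator
        res_name = line[17:20]      # columns 18-20: residue name

        # Task 1: delete solvent (water residue names only, an assumption).
        if delete_solvent and res_name in ("HOH", "WAT"):
            continue

        # Task 2: keep only the first alternate location (assumed to be "A").
        if delete_altlocs:
            if alt_loc not in (" ", "A"):
                continue
            line = line[:16] + " " + line[17:]

        # Task 3: selenomethionine (MSE) -> methionine (MET); the selenium
        # atom SE becomes the sulfur atom SD and HETATM becomes ATOM.
        if mse_to_met and res_name == "MSE":
            atom_name = line[12:16]
            element = line[76:78]
            if atom_name.strip() == "SE":
                atom_name = " SD "
                element = " S"
            line = ("ATOM  " + line[6:12] + atom_name + line[16] + "MET"
                    + line[20:76] + element + line[78:])

        cleaned.append(line.rstrip())
    return cleaned
```

Calling `prepare_pdb_lines(lines, mse_to_met=False)`, for example, would remove waters and extra alternate locations while leaving selenomethionine residues untouched. The remaining tasks (hydrogen addition, side-chain repair, atom typing, charge assignment, Mol2 output) need chemical perception and are deliberately left out of this sketch.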
Another feature added primarily for DOCK users is the ability to save molecular surfaces in MS/DMS format. In many applications of DOCK, spheres are used to characterize the "negative volume" of the binding site. The program sphgen, included with DOCK for calculating such spheres, requires an input dot molecular surface in MS/DMS format. Previously, generating such a surface required running a separately obtained, nongraphical program. Now, with Chimera, users can easily display the molecular surface, limit it to the presumed binding site, and write it out in the format required by sphgen.
ViewDock provides a convenient interface for looking at docked ligands in the context of the receptor site and prioritizing them by various criteria. Improvements to ViewDock continue and include tallying the hydrogen bonds between each ligand and the receptor, depicting total and component DOCK scores as histograms, filtering ligands using selections from one or more histograms, and drawing 2D chemical diagrams. Although development of ViewDock has not been as active as that of Dock Prep, it remains an important area of collaborative development with the RBVI.
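The histogram-based filtering described above can be pictured with a short Python sketch. It is not ViewDock code and does not read DOCK output. The ligand records are plain dictionaries, and the field names used in the usage note (`total`, `hbonds`) are hypothetical stand-ins for a total score and a tallied hydrogen-bond count.

```python
def histogram(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    counts = [0] * n_bins
    if not values:
        return counts
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def filter_by_ranges(ligands, ranges):
    """Keep ligands whose fields all fall inside the selected ranges.

    ligands: list of dicts mapping a field name to a number.
    ranges:  dict mapping a field name to an inclusive (low, high) tuple,
             mimicking a selection made on one or more histograms.
    """
    return [
        lig for lig in ligands
        if all(low <= lig[field] <= high
               for field, (low, high) in ranges.items())
    ]
```

For example, `filter_by_ranges(ligands, {"total": (-45, -30), "hbonds": (1, 10)})` would keep only ligands that score within the chosen window and form at least one hydrogen bond with the receptor, which corresponds to combining selections from two histograms.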
1. Kopec, K.K., Bozyczko-Coyne, D., and Williams, M., Biochem. Pharmacol., 69 (2005) 1133-1139.
2. Congreve, M., Murray, C.W., and Blundell, T.L., Drug Discovery Today, 10 (2005) 895-907.
3. Kraljevic, S., Stambrook, P.J., and Pavelic, K., EMBO Rep., 5 (2004) 837-842.
4. Schnecke, V. and Bostrom, J., Drug Discovery Today, 11 (2006) 43-50.
5. Hillisch, A., Pineda, L.F., and Hilgenfeld, R., Drug Discovery Today, 9 (2004) 659-669.
6. Posner, B.A., Curr. Opin. Drug Discovery Dev., 8 (2005) 487-494.
7. Alvarez, J.C., Curr. Opin. Chem. Biol., 8 (2004) 365-370.
8. Moustakas, D.T., Lang, P.T., Pegg, S., Pettersen, E., Kuntz, I.D., Brooijmans, N., and Rizzo, R.C., J. Comp.-Aided Mol. Design, 20 (2006) 601-619.
9. Verdonk, M.L., Cole, J.C., Hartshorn, M.J., Murray, C.W., and Taylor, R.D., Proteins, 52 (2003) 609-623.
10. Friesner, R.A., Banks, J.L., Murphy, R.B., Halgren, T.A., Klicic, J.J., Mainz, D.T., Repasky, M.P., Knoll, E.H., Shelley, M., Perry, J.K., Shaw, D.E., Francis, P., and Shenkin, P.S., J. Med. Chem., 47 (2004) 1739-1749.
11. Halgren, T.A., Murphy, R.B., Friesner, R.A., Beard, H.S., Frye, L.L., Pollard, W.T., and Banks, J.L., J. Med. Chem., 47 (2004) 1750-1759.
12. Lischner, R., C++ in a Nutshell, 1st ed., Sebastopol, CA: O'Reilly Media, Inc., 2003.
13. Case, D.A., Cheatham III, T.E., Darden, T., Gohlke, H., Luo, R., Merz Jr., K.M., Onufriev, A., Simmerling, C., Wang, B., and Woods, R., J. Comp. Chem., 26 (2005) 1668-1688.
14. Wang, J., Wang, W., Kollman, P.A., and Case, D.A., J. Mol. Graph. Model., 25 (2006) 247-260.
15. Dunbrack, R.L. Jr., Curr. Opin. Struct. Biol., 12 (2002) 431-440.
16. Lovell, S.C., Word, J.M., Richardson, J.S., and Richardson, D.C., Proteins, 40 (2000) 389-408.