Docking Review 11/24/08

One purpose of this paper presentation is to explore whether Chimera should be improved to better support docking programs other than DOCK. The paper does discuss the popularity of different docking programs, but otherwise it is about the "science" of docking, a somewhat different topic.

To try to accommodate both, the presentation will be in two chunks:

Review paper:
Sousa SF, Fernandes PA, Ramos MJ. Protein-ligand docking: current status and future challenges. Proteins. 2006 Oct 1;65(1):15-26.

Docking aims to predict the structure of a complex from structures of the separated molecules.

Each part of docking represents a trade-off between thoroughness/accuracy and computational demands:

Search Algorithms (Flexible Ligand Docking)

Scoring Functions

(Protein) Receptor Flexibility

The older concepts of lock-and-key fit (rigid preorganization) and induced fit (receptor adjusts as it binds ligand) have been replaced by a picture where unbound structures form ensembles of conformations, some of which resemble the bound conformation. Ligand binding stabilizes those states and shifts the equilibrium among the different conformations.
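The population-shift picture can be made concrete with a toy linkage model. This is only a sketch and is not taken from the review: it assumes a handful of discrete receptor conformations with known unbound populations, a single ligand, and a separate dissociation constant for each conformation. The function name and all numbers are hypothetical.

```python
def shifted_populations(free_populations, kds, ligand_conc):
    """Fraction of receptor (free plus bound) found in each conformation.

    free_populations: populations of the conformations with no ligand (sum to 1)
    kds: dissociation constant of the ligand for each conformation (same units
         as ligand_conc)
    ligand_conc: free ligand concentration
    """
    # Each conformation is weighted by its unbound population times its
    # binding polynomial (1 + [L]/Kd); tighter binders gain more weight.
    weights = [p * (1.0 + ligand_conc / kd)
               for p, kd in zip(free_populations, kds)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: a rare conformation (5%) that binds tightly
# and a dominant conformation (95%) that binds weakly.
unbound = [0.95, 0.05]
kds = [1e-3, 1e-8]          # molar, made-up values
for conc in (0.0, 1e-6):    # no ligand, then 1 micromolar ligand
    print(conc, shifted_populations(unbound, kds, conc))
```

With no ligand the function returns the unbound populations unchanged. With ligand present, the rare but tightly binding conformation becomes the majority, which is the equilibrium shift described above.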

Digressing from the 2006 review paper...

In case anyone is interested, here is a good recent review on handling protein flexibility in the context of docking:

Andrusier N, Mashiach E, Nussinov R, Wolfson HJ. Principles of flexible protein-protein docking. Proteins. 2008 Nov 1;73(2):271-89.
(even though it is about protein-protein docking, most of it applies also to the protein side of protein-small molecule docking; covers hinge-finding, normal mode analysis, principal components analysis, sidechain placement algorithms, implementations in docking programs, etc.)

Recent Docking Developments

An exciting recent development in docking is predicting the function of an enzyme by identifying its natural substrates. Functional annotation from structure is a huge and growing area of investigation.

These successful predictions were enabled by identifying the protein of interest as a member of a particular superfamily. This allowed narrowing the possible substrates and types of reactions to a reasonable search space. Different approaches were used to accurately rank the docked molecules, however.

Popularity of Different Methods

It is difficult to compare protein-ligand docking programs. There is even a paper with that title. Nevertheless, many have tried. A couple of recent examples:

Comparative evaluation of eight docking tools for docking and virtual screening accuracy. Kellenberger E, Rodrigo J, Muller P, Rognan D. Proteins. 2004 Nov 1;57(2):225-42.
(compared Dock4, FlexX, Fred, Glide, Gold, Slide, Surflex, QXP and found that Glide, Gold, and Surflex were the most successful)

A critical assessment of docking programs and scoring functions. Warren GL, Andrews CW, Capelli AM, Clarke B, LaLonde J, Lambert MH, Lindvall M, Nevins N, Semus SF, Senger S, Tedesco G, Wall ID, Woolven JM, Peishoff CE, Head MS. J Med Chem. 2006 Oct 5;49(20):5912-31.
(compared 10 docking programs and 37 scoring functions; results varied widely for different receptors, but in general Dock4 performance was poor)

Back to the 2006 review paper... prevalence of use according to literature citations was presented in figures:

Of the top five (above), only AutoDock and DOCK are free to academics.

Chimera Users and Docking

I had saved an old user e-mail exchange on this topic.

I looked through my Chimera citations 2008 file for notes about use of docking programs. The notes were intended to describe how Chimera was used, not necessarily any other programs, so these counts are a lower bound. Also, even though each paper both cited Chimera and described use of a docking program, Chimera wasn't necessarily used to view the docking results.

For comparison, there were 3 uses of DOCK (2 of those probably also used Chimera ViewDock) and 1 MORDOR paper (also mentioned ViewDock) out of 401 papers total listed.

In addition, I recently corresponded with Sebastian Kruggel, who uses both FlexX and AutoDock, and Eric recently corresponded with Sergio Marques, who uses Gold. Neither is an author on any of the citations counted above.

More Tidbits

AutoDock is free (GNU GPL), but registration is required. There is an associated graphical interface, AutoDockTools (ADT). AutoDock can allow protein sidechain flexibility. It uses PDBQT format, basically a decorated PDB format similar to PDBPQR. There wasn't a handy example file, just that cut-paste-unfriendly display, but either Chimera reads it and just complains about bad records, or it is relatively easy for users to chop off the extra stuff and read it as regular PDB. There are tutorials with sample input and output files supplied as tar.gz.
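As a rough illustration of "chopping off the extra stuff," here is a minimal Python sketch. It assumes the usual PDBQT layout: ATOM/HETATM records carry a partial charge and an AutoDock atom type after the B-factor field (past column 66), and the torsion tree is described by ROOT, ENDROOT, BRANCH, ENDBRANCH, and TORSDOF records, with BEGIN_RES and END_RES wrapping flexible residues. The function name and file names are hypothetical, and the sketch has not been tested against real AutoDock output or against Chimera's PDB reader.

```python
# Record types assumed to occur only in PDBQT, not in standard PDB.
PDBQT_ONLY = ("ROOT", "ENDROOT", "BRANCH", "ENDBRANCH", "TORSDOF",
              "BEGIN_RES", "END_RES")


def pdbqt_to_pdb_lines(lines):
    """Return PDB-style lines from an iterable of PDBQT lines."""
    out = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(PDBQT_ONLY):
            # Torsion-tree bookkeeping has no PDB equivalent; drop it.
            continue
        if line.startswith(("ATOM", "HETATM")):
            # Keep columns 1-66 (through the B-factor); this discards the
            # partial charge and AutoDock atom type that follow.
            line = line[:66].rstrip()
        out.append(line)
    return out


def convert_file(pdbqt_path, pdb_path):
    """Write a stripped copy of pdbqt_path to pdb_path."""
    with open(pdbqt_path) as src:
        pdb_lines = pdbqt_to_pdb_lines(src)
    with open(pdb_path, "w") as dst:
        dst.write("\n".join(pdb_lines) + "\n")
```

For example, convert_file("ligand.pdbqt", "ligand.pdb") would write a file containing only the standard records. Other records such as REMARK, MODEL, and ENDMDL are passed through untouched. Truncating at column 66 also means no element column is written, so a reader would have to infer elements from atom names.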

UCSF DOCK is free for academics, but you have to fill out a licensing form. DOCK 6 can allow receptor flexibility.

Gold is from the Cambridge Crystallographic Data Centre. It can allow partial protein flexibility (backbone and sidechain flexibility of up to 10 residues).

FlexX is from BioSolveIT and can be interfaced with MOE and SYBYL (Tripos).