[Dockdev] Docking and Scoring algorithms
jji at cgl.ucsf.edu
Wed Feb 23 12:46:49 PST 2005
Hi David (cc: dockdev at docking.org, dock-fans at docking.org),
Thanks for your contribution to the DOCK developers' discussion
group. We welcome all comments and opinions! Arguably this thread fits more
in with the dock-fans mailing list, so I've copied them on this.
I wonder whether there is a misunderstanding about molecular docking
and virtual screening lurking behind what you've written. In our experience,
molecular docking (virtual screening in high throughput) is considered to be
doing well retrospectively if it can
a) enrich known binders 20 fold over random from a database of drug-like
b) reproduce qualitatively the experimental binding geometries (McGovern &
Shoichet, J Med Chem. 2003 Jul 3;46(14):2895-907.)
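As a rough illustration of criterion (a), the enrichment factor can be computed as the fraction of known binders in the top-scoring slice of the ranked database divided by the fraction of binders in the whole database. The sketch below is not taken from DOCK. The function name, the 1% cutoff, the toy data, and the convention that a lower score is better are all assumptions made for the example.

```python
def enrichment_factor(scores, is_binder, top_fraction=0.01):
    """Enrichment of known binders in the top-scoring slice over random.

    scores       : one docking score per molecule (assumed: lower is better)
    is_binder    : one bool per molecule, True for a known binder
    top_fraction : slice of the ranked database to inspect (assumed 1%)
    """
    if len(scores) != len(is_binder):
        raise ValueError("scores and is_binder must have the same length")
    n_total = len(scores)
    n_binders = sum(1 for flag in is_binder if flag)
    if n_total == 0 or n_binders == 0:
        raise ValueError("need a non-empty database with at least one binder")

    # Size of the top slice, never smaller than one molecule.
    n_top = max(1, int(round(n_total * top_fraction)))

    # Rank molecules from best (lowest) to worst score.
    ranked = sorted(range(n_total), key=lambda i: scores[i])
    binders_in_top = sum(1 for i in ranked[:n_top] if is_binder[i])

    # (binders_in_top / n_top) / (n_binders / n_total), written as one ratio.
    return (binders_in_top * n_total) / (n_top * n_binders)


# Toy database: 1000 molecules, 10 known binders, 2 of them ranked in the top 10.
toy_scores = [float(i) for i in range(1000)]
toy_binders = [i in (0, 1) or 500 <= i < 508 for i in range(1000)]
print(enrichment_factor(toy_scores, toy_binders))  # 20.0 for this toy data
```

A value of 1.0 means the ranking does no better than picking molecules at random; the toy data above is constructed to land on the 20-fold figure mentioned in (a).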
Prospectively, we consider docking a success if we purchase and test
50 compounds from among the top 500 of a database of purchasable, drug-like
compounds (e.g. ZINC http://zinc.docking.org/ ) and find 3 previously
That's a pretty low bar, but it is considered the state of the art
in this field. If someone shows me a quantitative comparison between docking
energies and experimental binding affinities, unless it is within a narrow
SAR series (and therefore not very interesting), my instinct is to believe
it is an accidental correlation, and that people are fooling themselves into
believing the correlation is significant.
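One quick sanity check on any such correlation is the t-statistic for a Pearson coefficient. The sketch below applies it to the figures quoted further down in this message (R^2 of about 0.3 and about 0.02 on roughly 73 complexes). The helper name is hypothetical, the sign of r is assumed positive, and the test assumes independent, roughly normally distributed data, which docking scores may well not satisfy.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for a Pearson correlation r measured on n paired points.

    Tests the null hypothesis of zero correlation, with n - 2 degrees
    of freedom.
    """
    if n < 3:
        raise ValueError("need at least 3 points")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


n_complexes = 73  # approximate dataset size quoted in the original message
for r_squared in (0.3, 0.02):
    r = math.sqrt(r_squared)  # assumption: positive correlation
    t = correlation_t_statistic(r, n_complexes)
    print(f"R^2 = {r_squared:.2f}  r = {r:.3f}  t = {t:.2f}")
```

For 71 degrees of freedom the two-sided 5% critical value of t is roughly 1.99, so the R^2 of 0.3 clears it comfortably while the R^2 of 0.02 does not. Even a statistically significant value says nothing about whether the correlation is physically meaningful.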
You can list a dozen reasons why docking shouldn't even work, much
less provide good correlations with experimental binding affinities.
Indeed, in our experience, 90+% of top docking hits are not actual binders.
Correlate that! It is hardly worth repeating to this audience, but the reasons
docking shouldn't work include, but are not limited to, the approximations of the
scoring function, the inadequate treatment of desolvation and entropy, and
the rigid or incomplete sampling of receptor structure.
We think of docking as a screen that sorts a database into "more
likely" (top scorers) and "less likely" (the rest) to actually bind
experimentally. Of course, we are actively working to improve docking, and
there is reason to hope that docking can be improved. One way to do this is
to focus on the decoys, and ask what makes molecules score well in the
computer when they do not bind experimentally. This is one area of research
in the lab, and the subject of a paper that will appear shortly from Graves
and Shoichet 2005.
You are right to be cautious, and I encourage you to perform due
diligence on DOCK5 or any other docking program you choose to use. We
certainly do (see McGovern 2003 as above). But I think you also need to have
realistic expectations of docking technology. As you point out, getting free
energy perturbation calculations to correlate with experiment has been
difficult enough. What do you expect with docking calculations that spend a
few seconds or even a few minutes per molecule?
John Irwin http://johnirwin.compbio.ucsf.edu
> It seems to me that there are a multitude of docking
> algorithms out there, all of which have individual quirks
> (Kitchen et al.), and none of which work perfectly for every
> type of interaction (since simplifying a thermodynamic binding
> potential energy calculation must obviously make assumptions).
> Dr. Kuntz recently wrote his own review of docking
> methodologies (Brooijmans & Kuntz).
> It seems clear in these reviews that the most challenging
> task of the docking function is to reproduce correct binding
> energies. Even in the MD community, it has been difficult to
> create force-fields that do this task, and P-Chemists are
> working toward quantum corrections to these methods (what
> seems like the opposite direction).
> I have recently tried scoring several receptor-ligand
> complexes (those that worked from the Gold validation set)
> with different scoring functions, and found that the average
> correlation (R^2) between different scoring functions is about
> 0.3, that is 30% similarity. Dock5, as it is installed here,
> however, gave scores with a correlation of about 0.02, right
> around the limit of statistical validity for our dataset (~73
> receptor-ligand pairs).
> I have also tried changing the random number generation
> seed, and found that (with the parameters included in the
> methotrexate example) Dock's energy scores vary by +/- 0.5,
> which (I believe) is acceptable.
> Anyway, I am highly skeptical of Dock5's scoring algorithm,
> and uncertain about publishing any work based upon it until I
> have been able to reproduce a successful screening. This is,
> of course, difficult to do since assembling a list of relevant
> compounds with known binding affinities in the same conditions
> is time-consuming.
> Brooijmans, N. & Kuntz, I. D. (2003) Annu. Rev. Biophys.
> Biomol. Struct. 32, 335-373.
> Kitchen, D. B., Decornez, H., Furr, J. R. & Bajorath, J. (2004)
> Nature Reviews Drug Discovery 3, 935-949.
> Gold Validation Set:
> ~ David Rogers
> Graduate Student
> Department of Chemistry
> University of Cincinnati