[Dock-fans] DOCKing with different architectures?
John J. Irwin
jji at cgl.ucsf.edu
Wed Jul 16 11:05:09 PDT 2008
Marshall Levesque wrote:
> This may be a general question that applies to all of the DOCK
> software suite, or only certain parts:
> If one were to perform a screening of 100 small compounds (e.g. from
> ZINC) using DOCK6 (grid energy and/or AMBER score) and the workload
> was split between two different architectures (32-bit/64-bit,
> different compiler versions), are there any issues with using the
> results ranked by energy score? In the situation described, 50
> compounds are screened on each machine, with the same target and same input.
> I'm asking this because if I run the same set of compounds on two
> different architectures, I get similar results with similar rankings
> and scores, but sometimes there is the occasional swing in score for
> some of the compounds (e.g. -20 --> -8 for grid energy score). These
> large changes in score are obviously disconcerting, but even the small
> changes (-20 --> -19) could cause a significant shift in rankings when
> screening large datasets on the order of 10^5 or 10^6 compounds.
> Those most familiar with the DOCK algorithms might know best. Is the
> difference in score coming from different architectures something to
> do with the calculation of the score? Or the orientation/conformation
> of the compounds by anchor-and-grow?
> I felt that the limited sampling of the search space results in the
> fact that one can never produce a TRUE score, but more sampling does
> narrow the window of discrepancy in energy score for the same compound
> DOCKed on two different architectures, leading me to believe the
> conformation search is at fault.
> Any insight into this would be greatly appreciated, thanks!
We use a different version of DOCK, but I think the conclusions are
general for the method. It is normal for DOCK to produce very slightly
different results for identical input on different hardware due to the
accumulation of small rounding errors in floating-point arithmetic.
Occasionally, the "slight difference" will be at a saddle point during a
step of minimization, resulting in a different local minimum being
found, and thus, potentially dramatically different results. But you
don't have to go to new hardware to see this phenomenon. Just reverse
the order of molecules in the database, so that minimization of a
compound starts with a different random seed.
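
To make the two effects above concrete, here is a minimal Python sketch. It
is a toy illustration only: the double-well "energy", the plain
steepest-descent loop, and the names toy_energy_gradient and minimize are
assumptions made for this example and are not DOCK's scoring function or
minimizer. The first part shows that summing the same numbers in a different
order can change the last digit; the second shows how a difference of that
size, at the top of a barrier, sends minimization to a different local
minimum.

```python
# Part 1: floating-point addition is not associative, so the same terms
# summed in a different order (as different compilers or CPUs may do)
# can give totals that differ in the last digits.
terms = [0.1, 0.2, 0.3]
forward = (terms[0] + terms[1]) + terms[2]
backward = terms[0] + (terms[1] + terms[2])
tiny_difference = abs(forward - backward)  # nonzero, on the order of 1e-16


# Part 2: a toy double-well energy E(x) = (x^2 - 1)^2 with minima at
# x = -1 and x = +1 and a barrier top at x = 0 (a 1-D stand-in for a
# saddle point). This is a hypothetical function, not a DOCK score.
def toy_energy_gradient(x):
    """Derivative of E(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)


def minimize(x_start, step=0.05, n_steps=1000):
    """Plain steepest descent; returns the x where the search ends up."""
    x = x_start
    for _ in range(n_steps):
        x -= step * toy_energy_gradient(x)
    return x


# Two starting points that differ by about a rounding error, one on each
# side of the barrier top, end in different minima.
end_a = minimize(+1e-12)
end_b = minimize(-1e-12)

if __name__ == "__main__":
    print("sum order difference:", tiny_difference)
    print("minimum reached from +1e-12:", end_a)
    print("minimum reached from -1e-12:", end_b)
```

In a real docking run the energy surface has many dimensions and many
minima, so the same kind of split can change the final pose and score of a
compound even though every individual operation differs only by rounding.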
The bottom line? Docking can be useful, but has important weaknesses.
Make predictions, test them, and go back afterwards and check the
calculation against the experimental results.
I hope this helps.
UCSF DOCK Team