[Dock-fans] DOCKing with different architectures?

John J. Irwin jji at cgl.ucsf.edu
Wed Jul 16 11:05:09 PDT 2008


Hi Marshall

Marshall Levesque wrote:
> This may be a general question that applies to all of the DOCK 
> software suite, or only certain parts:
>
> If one were to perform a screening of 100 small compounds (e.g. from 
> ZINC) using DOCK6 (grid energy and/or AMBER score) and the workload 
> was split between two different architectures (32-bit/64-bit, 
> different compiler versions), are there any issues with using the 
> results ranked by energy score?  For this described situation, 50 
> compounds screened on each machine, same target, same input 
> files/parameters.
>  I'm asking this because if I run the same set of compounds on two 
> different architectures, I get similar results with similar rankings 
> and scores, but sometimes there is the occasional swing in score for 
> some of the compounds (e.g. -20 --> -8 for grid energy score).  These 
> large changes in score are obviously discomforting, but even the small 
> changes (-20 --> -19) could cause a significant shift in rankings when 
> screening large datasets on the order of 10^5 or 10^6.  
>
> Those most familiar with the DOCK algorithms might know best.  Is the 
> difference in score coming from different architectures something to 
> do with the calculation of the score? or the orientation/conformation 
> of the compounds by anchor-and-grow?  
>
> I felt that the limited sampling of the search space means one can 
> never produce a TRUE score, but more sampling does narrow the window 
> of discrepancy in energy score for the same compound DOCKed on two 
> different architectures, leading me to believe the conformational 
> search is at fault.
>
> Any insight into this would be greatly appreciated, thanks!
We use a different version of DOCK, but I think the conclusions are 
general for the method. It is normal for DOCK to produce very slightly 
different results for identical input on different hardware, due to the 
accumulation of small rounding errors in floating-point arithmetic. 
Occasionally, that slight difference will fall at a saddle point during a 
step of minimization, so a different local minimum is found and the 
results can change dramatically. But you 
don't have to go to new hardware to see this phenomenon. Just reverse 
the order of molecules in the database, so that minimization of a 
compound starts with a different random seed.
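
To make the two effects concrete, here is a minimal sketch in plain 
Python/NumPy (not DOCK code; the double-well "energy" below is a toy 
function, not a DOCK scoring term): summing the same terms in a 
different order gives a slightly different floating-point total, and a 
rounding-sized difference in the starting point near an unstable 
critical point sends a simple gradient-descent minimizer into a 
different local minimum.

    import numpy as np

    # (1) Rounding: identical terms, accumulated in a different order,
    #     give a slightly different single-precision total.
    terms = np.random.default_rng(0).normal(size=100_000).astype(np.float32)
    fwd = np.float32(0.0)
    for t in terms:
        fwd += t
    bwd = np.float32(0.0)
    for t in terms[::-1]:
        bwd += t
    print(fwd, bwd, fwd == bwd)      # typically not exactly equal

    # (2) A toy double-well "energy" E(x) = x^4 - 2x^2 has an unstable
    #     critical point at x = 0 and minima at x = -1 and x = +1.
    #     A rounding-sized difference in the starting point decides
    #     which minimum the descent finds.
    def grad(x):
        return 4.0 * x**3 - 4.0 * x

    def minimize(x, lr=0.01, steps=2000):
        for _ in range(steps):
            x -= lr * grad(x)
        return x

    print(minimize(+1e-7))           # converges to about +1.0
    print(minimize(-1e-7))           # converges to about -1.0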

The bottom line? Docking can be useful, but has important weaknesses.  
Make predictions, test them, and go back afterwards and check the 
calculation against the experimental results.

I hope this helps.

John
UCSF DOCK Team
