[Dock-fans] DOCKing with different architectures?

John J. Irwin jji at cgl.ucsf.edu
Wed Jul 16 12:18:27 PDT 2008


Hi Marshall

Marshall Levesque wrote:
> John-
>
> Thank you for the quick and helpful reply. 
>
> I had assumed this was the cause of discrepancies.  So in your 
> opinion, results from a screening experiment that was run on a single 
> machine have equal "validity" when compared to the same screening 
> results obtained using two different architectures? 
Yes, equally "valid", by which I mean that neither set has any "validity"
at all. What does it mean to say a docking prediction is valid? It means
the compound actually works against the enzyme. Either way, these are
predictions that need to be tested.

John
UCSF DOCK Team
>
> -Marshall
>
> On Wed, Jul 16, 2008 at 11:05 AM, John J. Irwin <jji at cgl.ucsf.edu> wrote:
>
>     Hi Marshall
>
>     Marshall Levesque wrote:
>     > This may be a general question that applies to all of the DOCK
>     > software suite, or only certain parts:
>     >
>     > If one were to perform a screening of 100 small compounds (e.g. from
>     > ZINC) using DOCK6 (grid energy and/or AMBER score) and the workload
>     > was split between two different architectures (32-bit/64-bit,
>     > different compiler versions), are there any issues with using the
>     > results ranked by energy score?  In the situation described, 50
>     > compounds are screened on each machine, same target, same input
>     > files/parameters.
>     > I'm asking this because if I run the same set of compounds on two
>     > different architectures, I get similar results with similar rankings
>     > and scores, but occasionally there is a swing in score for some of
>     > the compounds (e.g. -20 --> -8 for grid energy score).  These large
>     > changes in score are obviously disconcerting, but even the small
>     > changes (-20 --> -19) could cause a significant shift in rankings
>     > when screening large datasets on the order of 10^5 or 10^6.
>     >
>     > Those most familiar with the DOCK algorithms might know best.  Is
>     > the difference in score across architectures something to do with
>     > the calculation of the score itself, or with the
>     > orientation/conformation of the compounds generated by
>     > anchor-and-grow?
>     >
>     > My feeling was that the limited sampling of the search space means
>     > one can never produce a TRUE score, but that more sampling does
>     > narrow the window of discrepancy in energy score for the same
>     > compound DOCKed on two different architectures, which leads me to
>     > believe the conformational search is at fault.
>     >
>     > Any insight into this would be greatly appreciated, thanks!
>     We use a different version of DOCK, but I think the conclusions are
>     general for the method. It is normal for DOCK to produce very slightly
>     different results for identical input on different hardware, due to the
>     accumulation of small rounding errors in floating-point arithmetic.
>     Occasionally, the "slight difference" falls at a saddle point during a
>     step of minimization, so that a different local minimum is found and
>     the results can be dramatically different. But you don't have to go to
>     new hardware to see this phenomenon. Just reverse the order of
>     molecules in the database, so that minimization of a compound starts
>     with a different random seed.
>
>     The bottom line? Docking can be useful, but has important weaknesses.
>     Make predictions, test them, and go back afterwards and check the
>     calculation against the experimental results.
>
>     I hope this helps.
>
>     John
>     UCSF DOCK Team
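
A minimal sketch of the effect described in the quoted reply above (plain
Python, not DOCK code; the energy surface, step size, and perturbation are
invented for illustration): two minimizations start from coordinates that
differ by roughly one double-precision rounding error, sitting on the ridge
between two unequal minima. The sign of that tiny difference alone decides
which basin the minimizer slides into, and the final "scores" then differ by
far more than rounding error, much like a -20 --> -8 swing in grid score
between architectures.

    # Toy model, not DOCK: a double-well "energy" surface with unequal minima
    # near x ~ +0.89 and x ~ -1.12, separated by a ridge along x = 0.
    def score(x, y):
        return (x * x - 1.0) ** 2 + 0.3 * x ** 3 + y * y

    def grad(x, y):
        return 4.0 * x * (x * x - 1.0) + 0.9 * x * x, 2.0 * y

    def minimize(x, y, step=0.01, iters=5000):
        # Plain gradient descent, standing in for the minimizer in a docking run.
        for _ in range(iters):
            gx, gy = grad(x, y)
            x, y = x - step * gx, y - step * gy
        return x, y

    # Identical input except for a perturbation on the scale of
    # double-precision rounding error in the starting x coordinate.
    for label, eps in (("machine A", +1e-15), ("machine B", -1e-15)):
        x, y = minimize(eps, 0.5)
        print(f"{label}: final x = {x:+.3f}, score = {score(x, y):+.3f}")

    # Prints roughly:
    #   machine A: final x = +0.894, score = +0.255
    #   machine B: final x = -1.119, score = -0.357

In a screen of 10^5 or 10^6 compounds, a handful of such basin flips plus the
pervasive sub-unit rounding wobble is enough to reshuffle the ranking near any
cutoff, which is consistent with the observation that two architectures agree
on the broad picture but not on the exact ordering.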

