[Dock-fans] cpu-hours for virtual screening
keshet1 at umbc.edu
Wed Sep 2 08:10:52 PDT 2009
Scott Brozell wrote:
> On Fri, Aug 28, 2009 at 02:39:46PM -0400, Ben Keshet wrote:
>> Scott Brozell wrote:
>>> On Mon, Aug 24, 2009 at 12:39:41PM -0400, Ben Keshet wrote:
>>>> I am preparing for a virtual screening of about 370,000 ligands, and
>>>> trying to estimate the time it would require to complete. In the Dock 5
>>>> paper (J Comput Aided Mol Des (2006) 20:601–619), I saw that the
>>>> average time per ligand was 315+-450 sec. Is that a representative value
>>>> for Dock6.3?
>>> Yes, but it depends a lot on your receptor and ligands.
>>> See for example figure 1 in
>>> which plots floppiness vs time: 0 to 40 rotatable bonds vs
>>> 0 to 9000 sec.
>> Thanks for the reference. The ligands I tested so far and those I am
>> planning to screen have no more than 7 rotatable bonds, so theoretically
>> I should be performing under ~20 min. What other parameters of the
>> receptor could slow down the processing? Could working on Cygwin slow
>> down processing?
> Here are some algorithmic details:
> I don't use cygwin, but
> Here is another situation where a benchmark suite would be useful -
> still on our todo list. If you think this is the problem then you could
> download some standard suite: gooooggggle on cygwin benchmark
> gives several.
I tried flexible docking a tiny subset (11 structures) both with cygwin
and Linux (RHE 4). The input files and the hardware are identical. A
few quick observations:
1) cygwin was roughly 7.5 times slower on average (5120+-3250 sec
compared to 680+-514 sec; +- is 1 SD).
2) the scores were close but not identical. The average difference was
1.7, with one molecule as much as 7. This led to differences in the
ranking of the ligands.
3) the average RMSD was above 5 angstroms, but excluding 2 ligands which
had an RMSD>10A, the RMSD was 2.9A. This number may be misleading in the
case of my receptor, since the binding site is composed of 6 parallel
sequence-identical chains, so in some cases the poses are "transposed"
along the chains while maintaining similar interactions. Moreover, my
binding site is not very tight, so docking to it may be more challenging.
4) The number of conformations in linux (as indicated in the .out file)
was always lower than when using cygwin. What could be the reason?
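For observation 3, one way to keep chain-"transposed" poses from inflating
the RMSD is to compare each pose against every symmetry-equivalent copy of
the reference pose and keep the smallest value. Below is a minimal sketch
in Python. It assumes the atoms are in the same order in both poses and
that the equivalent copies (reference_images, a hypothetical input) have
already been generated by other means. The function names are illustrative
and not part of DOCK.

```python
import math

def rmsd(a, b):
    """Plain RMSD between two equal-length lists of (x, y, z) coordinates.

    No superposition is done: both poses are assumed to be in the
    receptor frame already, with atoms listed in the same order.
    """
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and have the same atom count")
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(a, b))
    return math.sqrt(sq / len(a))

def min_rmsd_over_images(pose, reference_images):
    """Smallest RMSD between `pose` and any symmetry-equivalent reference copy."""
    return min(rmsd(pose, image) for image in reference_images)

# Toy example: the 5.0 A shift is a made-up stand-in for "one chain over".
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shifted = [(x, y, z + 5.0) for (x, y, z) in reference]
plain = rmsd(shifted, reference)
corrected = min_rmsd_over_images(shifted, [reference, shifted])
print(f"plain RMSD {plain:.2f} A, symmetry-aware RMSD {corrected:.2f} A")
```

A pose that is merely shifted onto the neighbouring chain then scores as a
match instead of counting as a large deviation.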
My own conclusion is that, time-wise, it is better to run on Linux. I
don't know why the number of conformations was lower on Linux, or what
all the reasons are for the differences in scores, poses, and therefore
ranking (both installations passed the installation test).
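As a back-of-the-envelope check on the cpu-hours question, the mean times
from observation 1 can be scaled up to the full 370,000-ligand library on
about 50 CPUs. The sketch below assumes perfect load balancing, no I/O or
queueing overhead, and that the 11-ligand sample is representative (with
SDs this large it may well not be). The function name is made up for
illustration.

```python
def screening_wall_time_days(n_ligands, mean_sec_per_ligand, n_cpus):
    """Idealized wall-clock days: total CPU seconds split evenly over n_cpus."""
    total_cpu_sec = n_ligands * mean_sec_per_ligand
    return total_cpu_sec / n_cpus / 86400.0

N_LIGANDS = 370_000   # library size from the original question
N_CPUS = 50           # approximate cluster size from the original question

# Mean seconds per ligand measured on the 11-structure test set.
for label, mean_sec in (("Linux", 680), ("cygwin", 5120)):
    cpu_hours = N_LIGANDS * mean_sec / 3600.0
    days = screening_wall_time_days(N_LIGANDS, mean_sec, N_CPUS)
    print(f"{label}: {cpu_hours:,.0f} cpu-hours, ~{days:.0f} days on {N_CPUS} CPUs")
```

With these inputs the Linux estimate works out to about 70,000 cpu-hours,
or roughly 58 days of wall time on 50 CPUs, against about 440 days at the
cygwin rate.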
A slightly unrelated question - the number of orientations for all the
ligands was 1000, however the max_orientations parameter in the input
file is 500. What am I missing?
Thanks for the advice and answers,
>>>> I have docked several ligands against my receptor, but they took much
>>>> longer: ~2000-8000 seconds. The sphere cluster has 103 spheres. Does
>>>> that imply that I am far from the optimal parameter set?
>>> As far as spheres go, 103 looks normal.
>>> Here are statistics for number of spheres in cluster for the rna paper:
>>> ./1AKX/vac/rec.10A.sph:2:cluster 1 number of spheres in cluster 86
>>> label& trials & min & max & mean & standard_deviation & median & mode
>>> sph & 38 & 54 & 198 & 130.105263157895 & 31.9735001077654 & 123 & 117, 123 \\
>>>> Eventually I will be using a cluster of about 50 cpu's. Any time
>>>> estimate that anyone can provide as a general reference would be much
>>> I'd have to dig to determine how much time we used; maybe I'll get to it.
>> Thanks for the info and help,
Chemical & Biochemical Engineering
UMBC (Univ. of Maryland, Baltimore County)