[Dock-fans] Need more clarification
William Joseph Allen
william.joseph.allen at gmail.com
Tue Sep 8 14:55:03 PDT 2015
DOCK has a bunch of different scoring functions, and several more were
added to DOCK 6.8 which should be released within a few weeks:
Grid score, which is what I think you used, is a fast, relatively
inexpensive, relatively unbiased, physics-based scoring function. It does a
pretty good job of ranking different molecule poses quickly. The trade-off
for speed, in this case, is accuracy. There are other scoring functions
included with DOCK like the Amber score, GBSA score, and footprint score
which are much slower to compute, but there is some evidence and some
publications that demonstrate they are more accurate than grid score. (see
One limitation of these "more accurate" scoring functions is that they are
too slow to be used as the primary score in on-the-fly, fully flexible
docking. A molecule that would take 3 minutes to dock with grid score might
take 24 hours to dock with footprint score. Obviously, this is too slow.
One approach we sometimes use is to dock molecules with grid score,
then rescore those molecules with one of the slower-but-more-accurate
scoring methods. In a rescoring calculation, the ligand_atom_file is the
output from your previous grid-score docking run, and orient_ligand and
flexible_ligand are set to "no". This will take the poses you generated
with grid score, then calculate what would be the, e.g., Amber score for
each of those poses. Then they could be re-ranked by the Amber score, and
your "correct" pose may now be the number one pose.
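
The re-ranking step described above can be sketched in a few lines of Python. This is only an illustration, not part of DOCK: it assumes the grid score and the rescoring value (e.g., Amber score) for each pose have already been extracted from the DOCK output, and that a lower (more negative) score is better. The pose names and numbers are made up.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    name: str          # hypothetical pose label
    grid_score: float  # score from the original grid-score docking run
    rescore: float     # score from the slower rescoring run (e.g., Amber score)


def rank_by(poses, field):
    """Return poses sorted by the given score field, best (lowest) first."""
    return sorted(poses, key=lambda p: getattr(p, field))


# Made-up values for illustration only.
poses = [
    Pose("pose_1", grid_score=-52.1, rescore=-30.4),
    Pose("pose_2", grid_score=-51.7, rescore=-41.9),
    Pose("pose_3", grid_score=-48.3, rescore=-35.0),
]

by_grid = rank_by(poses, "grid_score")
by_rescore = rank_by(poses, "rescore")

print("grid-score ranking:", [p.name for p in by_grid])
print("rescored ranking:  ", [p.name for p in by_rescore])
```

In this made-up example, pose_2 is ranked second by grid score but moves to first place after rescoring, which is the situation described above.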
Another approach we have used is to create a scoring function which tries
to balance speed and accuracy. One of those scoring functions is called the
multigrid score. The paper for that is here:
This is sort of a cross between grid score and footprint score. It is
slower than grid score, but faster than footprint score. It is also more
accurate than grid score, but less accurate than footprint score. In some
scenarios, this trade-off may be acceptable.
I hope this answers your question,
On Mon, Sep 7, 2015 at 2:35 AM, Amali Guruge <amaligg2010 at gmail.com> wrote:
> Dear Prof.Joe,
> In my docking results, I got the correct ligand pose as the 2nd ranked
> molecule. According to your previous e-mail, I have to use different
> scoring functions to obtain the 2nd ranked molecule as the best ranked
> molecule. Can you please explain this in detail?
> Thank you.
> On Thu, Sep 11, 2014 at 8:10 PM, William Joseph Allen <
> william.joseph.allen at gmail.com> wrote:
>> Hi Amali,
>> I don't think the problem is linked to saving the .out file into a mol2
>> file with Gaussian, as you say. I think it is more likely that the problem
>> has to do with your DOCK input file parameters. Please see this page:
>> This is our general recommended parameter set for fully flexible docking.
>> Note that some of the parameters used in this set are not the default
>> parameters. In fact, the defaults don't actually perform very well on their
>> own, so we recently updated the DOCK source code to use the parameters
>> above as the defaults. This will be official as soon as DOCK 6.7 is released.
>> Also, I would not typically optimize the ligand geometry prior to
>> docking, particularly if the current geometry comes directly from a crystal
>> structure. You could probably use the protonated / charged version of the
>> ligand directly from Chimera for docking, and skip Gaussian altogether. The
>> parameter set above should output many possible docked configurations of
>> your ligand, and the main question is whether any predicted pose in that
>> ensemble is the "correct" pose, or within 2 Angstroms RMSD from the
>> crystallographic ligand. There are three possible outcomes:
>> (1) If the best-scoring pose is the "correct" pose, then it was a docking
>> success. We observe this about 73% of the time over our test set of 1,043
>> protein-ligand pairs, with higher success rates for smaller ligands. Now we
>> can usually say the protocol and parameter set we used to achieve that
>> result is suitable for additional ligands or a virtual screen, for example.
>> (2) If the "correct" ligand pose is found in the ensemble of answers, but
>> is not ranked as the best, we call that a scoring failure (19% of the
>> time). In this case, it is left to the user to play around with different
>> scoring functions and options based on what descriptors they think might
>> best describe the fitness of the correct pose.
>> (3) If the "correct" pose is not in the ensemble at all, we call this a
>> sampling failure (8% of the time). In these cases, one can consider
>> increasing some of the parameters including max_orientations, pruning_max_orients,
>> and pruning_clustering_cutoff. This will increase the amount of sampling
>> performed, but also increase the time required to dock.
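
The three outcomes listed above can be expressed as a small classification routine. This is a minimal sketch, not code from DOCK: it assumes you have already computed the RMSD (in Angstroms) of each docked pose to the crystallographic ligand, ordered from best-scoring to worst-scoring, and it treats a pose at exactly the 2 Angstrom cutoff as "correct". The function name is hypothetical.

```python
def classify_docking_outcome(rmsds_by_rank, cutoff=2.0):
    """Classify a docking run from pose RMSDs listed best-scoring first.

    rmsds_by_rank: RMSD of each pose to the crystal ligand, in Angstroms.
    cutoff: RMSD at or below which a pose counts as "correct" (assumed inclusive).
    """
    if not rmsds_by_rank:
        raise ValueError("no poses were provided")
    if rmsds_by_rank[0] <= cutoff:
        # The top-ranked pose is correct.
        return "docking success"
    if any(r <= cutoff for r in rmsds_by_rank[1:]):
        # A correct pose exists in the ensemble but is not ranked first.
        return "scoring failure"
    # No pose in the ensemble is close to the crystal pose.
    return "sampling failure"


# Made-up RMSD lists for illustration only.
print(classify_docking_outcome([0.8, 3.1, 5.4]))
print(classify_docking_outcome([4.2, 1.1, 6.0]))
print(classify_docking_outcome([4.2, 5.5, 6.0]))
```

A "scoring failure" result suggests trying a different scoring function or rescoring, while a "sampling failure" suggests increasing the sampling parameters mentioned above.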
>> The short answer is: Use the input parameters above. Usually you don't
>> need to optimize ligand geometry. Depending on the outcome (3
>> possibilities), proceed in different ways.
>> Joe Allen
>> On Wed, Sep 10, 2014 at 10:30 PM, Amali Guruge <amaligg2010 at gmail.com>
>>> Dear Joe Allen,
>>> Thank you very much for the reply. I optimized my ligands at
>>> HF/6-31G(d,p) level of theory. Then I saved optimized file (.out file) in
>>> mol2 format using Gaussian software. But my problem is when I do this for a
>>> known ligand and after the docking process it does not give the correct
>>> interactions between ligand and protein. Is this problem linked with the
>>> saving .out file into mol2 file? Protonation and assigning partial charges
>>> were carried out with Chimera. What should I do to overcome this problem?
>>> Please guide me.
>>> Thank you.
>> William Joseph Allen, Ph.D.
>> Postdoctoral Fellow
>> Dept. of Applied Mathematics and Statistics
>> Stony Brook University