DOCK 5.4 User Manual

 

 

 

Irwin D. Kuntz

Demetri T. Moustakas

P. Therese Lang

  

© University of California 2006

Last updated March 2006
 

General Overview

 

Ligand File I/O

 

            Currently, only MOL2 file I/O is supported.  Ligands are read from a single-molecule or multi-molecule MOL2 file.  Atom and bond types are assigned using the DOCK 4 atom/bond typing parameter files (vdw.defn, flex.defn, flex_drive.tbl).  More information about these files can be found in the Appendix.  There are several ligand output options, which write molecules to files whose names are formed from the ligand_outfile_prefix parameter:

 

By default, DOCK writes a scored molecules output file, outputprefix_scored.mol2, which contains the best scoring pose for each molecule in the database.  Beyond this file, there are several other levels of sampling output:

1)  Users can choose to write out orientations.  This creates a file called outputprefix_orients.mol2, which contains the molecules after they have been rigidly oriented and optimized.  If anchor & grow is being used, only the anchor fragment is written.  Every orientation generated is written out, so take care that the output does not become too large.

2)  Users can also write out conformers prior to final optimization.  This creates a file called outputprefix_confs.mol2.  Be aware that the number of molecules in this file equals the database size * the # of anchors per molecule * the # of orientations per anchor * the # of conformers per cycle.  The file can grow quite large, so use this option only for single poses or small databases.

3)  Finally, users can write molecules ranked by score.  This creates a file called outputprefix_ranked.mol2, which contains the top N molecules from the database.  This option disables the scored molecules output file by default, though users can override this and write out the best pose for each molecule as well.
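To illustrate how quickly the conformer output can grow, the product given in option 2 can be evaluated directly.  The following Python sketch is not part of DOCK; the function name and the example numbers are hypothetical.

```python
def estimate_conformer_records(database_size, anchors_per_molecule,
                               orientations_per_anchor, conformers_per_cycle):
    """Upper-bound estimate of the molecules written to the conformer file."""
    return (database_size * anchors_per_molecule
            * orientations_per_anchor * conformers_per_cycle)


# Hypothetical run: 1,000 ligands, 2 anchors each, 100 orientations per
# anchor, 100 conformers per cycle.
print(estimate_conformer_records(1000, 2, 100, 100))
```

Even this modest hypothetical database yields tens of millions of records, which is why the option is best reserved for single poses or small databases.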

 

Rigid Orienting

 

            DOCK 5 uses receptor spheres and ligand heavy atom centers to rigidly orient ligands in the receptor.  The spheres are generated using the accessory SPHGEN.  Cliques of receptor spheres & ligand centers are identified using the maximum subgraph clique detection algorithm from DOCK 4.  All cliques that satisfy the matching parameters are generated in the matching step, and can be sorted or ordered prior to the loop where the program cycles through the orientations. 

 

            Both automated and manual matching are available in DOCK 5.  The sphere/center matches are determined by two parameters:

1)     The distance tolerance is the tolerance in angstroms within which a pair of spheres is considered equivalent to a pair of centers.

2)     The distance minimum is the shortest distance allowed between two spheres (any sphere pair with a shorter distance is disregarded).

            Manual matching will create as many matches as possible given the specified parameters, and sort the matches according to the RMS error between the spheres and centers in the match.  The matches are provided as orientations until either max_orientations orientations have been generated or the end of the match list is reached.

            Automated matching will start with the default values for the distance tolerance and distance minimum.  A list of matches will be generated, and if the # of matches is less than max_orientations, the distance tolerance is increased and the matching is repeated until there are at least max_orientations matches in the list.  Then the list is sorted, and orientations are generated.
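The two matching parameters and the automated widening of the tolerance can be sketched in Python as follows.  This is an illustration only, not DOCK's implementation: the function names, the find_matches callback, the tolerance increment, and the upper limit on the tolerance are all assumptions made for the example (this manual does not state how much the tolerance is increased per round), and sorting by RMS error is assumed to apply to automated matching as it does to manual matching.

```python
def edge_matches(sphere_dist, center_dist, distance_tolerance=0.25,
                 distance_minimum=2.0):
    """Return True if a sphere pair and a ligand-center pair may be matched.

    Sphere pairs closer than distance_minimum are disregarded; otherwise the
    two distances must agree to within distance_tolerance (angstroms).
    """
    if sphere_dist < distance_minimum:
        return False
    return abs(sphere_dist - center_dist) <= distance_tolerance


def automated_matching(find_matches, max_orientations, tolerance=0.25,
                       increment=0.05, tolerance_limit=2.0):
    """Widen the tolerance until at least max_orientations matches exist.

    find_matches(tolerance) is a hypothetical callback that returns a list of
    (rms_error, match) tuples.  increment and tolerance_limit are assumed
    values chosen only for this sketch.
    """
    matches = find_matches(tolerance)
    while len(matches) < max_orientations and tolerance < tolerance_limit:
        tolerance += increment
        matches = find_matches(tolerance)
    # Sort by RMS error and keep the best max_orientations matches.
    return sorted(matches, key=lambda m: m[0])[:max_orientations]
```

The upper limit is included only so that the loop in the sketch always terminates, even for a ligand that can never produce enough matches.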

 

Ligand Flexibility

 

            Ligand flexibility in DOCK 5 uses the anchor-and-grow algorithm, which was introduced in DOCK 4.  Rotatable bonds (those not contained in rings) are used to partition the molecule into rigid segments, from which all anchors that meet the criteria are selected, beginning with the largest anchor segment.  All anchor orientations (or the starting orientation only, if no orienting is selected) are used as starting configurations onto which the first flexible layer is appended and conformationally expanded.  The total population of conformers is then reduced to the number specified by number_confs_for_next_growth, and the process is repeated until the last layer is reached.

 

            The conformer generator class now integrates score optimization in the anchor & grow algorithm.  The anchors can be rigidly optimized, the final conformations can be rigidly, torsionally, or completely optimized, and the partially grown conformers can be completely optimized.  The anchor & grow steps use whichever scoring function the user selects as the primary scoring function.  The final minimization step uses the secondary scoring function.

 

Scoring Functions

 

            This release of DOCK 5 implements a hierarchical scoring function strategy.  A master score class manages all scoring functions that DOCK uses.  Any of the DOCK scoring functions can be selected as the primary and/or the secondary scoring function.  The primary scoring function is used during the rigid minimization and anchor & grow steps, which typically make many calls to the scoring function.  The secondary scoring function is used in the final minimization, scoring, and ranking of the molecules.  If no secondary scoring function is selected, the primary scoring function is used as the secondary.

 

            This release contains continuous molecular mechanics based scoring (vdw + Coulombic terms only), grid-based molecular mechanics scoring, contact scoring, and bump filtering as implemented in DOCK 4.  Scoring grids are created using the GRID program.  DOCK also contains GB/SA scoring, as implemented in SDOCK.  Scoring grids for the GB/SA code are calculated using the accessories nchemgrid_GB and nchemgrid_SA.

 

            This release also includes an internal energy scoring function, which is used during the anchor & grow flexible search.  This function computes the Lennard-Jones and Coulombic energy between all ligand atom pairs, excluding all 1-2, 1-3, and 1-4 pairs.  This energy is not included in the final reported score.
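The pair exclusion described above can be sketched in Python.  The sketch assumes that 1-2, 1-3, and 1-4 pairs are atoms separated by one, two, and three bonds respectively; the function names and the example molecule are hypothetical and do not correspond to DOCK source code.

```python
from collections import deque
from itertools import combinations


def bond_separations(num_atoms, bonds):
    """Shortest bond-path length between atoms (breadth-first search per atom)."""
    neighbors = {i: set() for i in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    separation = {}
    for start in range(num_atoms):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nxt in neighbors[atom]:
                if nxt not in dist:
                    dist[nxt] = dist[atom] + 1
                    queue.append(nxt)
        separation[start] = dist
    return separation


def internal_energy_pairs(num_atoms, bonds):
    """Atom pairs kept for the internal energy: all except 1-2, 1-3, and 1-4."""
    separation = bond_separations(num_atoms, bonds)
    pairs = []
    for i, j in combinations(range(num_atoms), 2):
        # Pairs with no connecting path are treated as far apart and kept.
        if separation[i].get(j, 4) > 3:
            pairs.append((i, j))
    return pairs


# Hypothetical linear six-atom chain 0-1-2-3-4-5
chain_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
print(internal_energy_pairs(6, chain_bonds))
```

For the six-atom chain, only the pairs separated by four or more bonds remain, so short chains contribute very few internal energy terms.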

 

Score Optimization

 

            Score optimization is implemented using a simplex minimizer based on the DOCK 4 minimizer.  Users can choose to minimize the rigid anchors, minimize during flexible growth, and minimize the final conformation.  The anchor minimization is always done rigidly; if no flexible growth is being done, this step minimizes the entire molecule.  The minimization during the flexible growth is a complete (torsions + rigid) minimization.  The final minimization can be rigid only, torsions only, or complete.  The minimizer terminates when the simplex has “shrunk” enough that its highest and lowest points are within the scoring tolerance, or when the requested number of minimizer steps is reached.
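The termination rule can be written as a small Python predicate.  This is a sketch under the assumption that the "highest and lowest points" refer to the scores at the simplex vertices; the function and argument names are hypothetical.

```python
def simplex_should_stop(vertex_scores, score_tolerance, steps_taken, max_steps):
    """Return True when the simplex minimizer should terminate.

    vertex_scores: scores evaluated at the current simplex vertices.
    """
    converged = (max(vertex_scores) - min(vertex_scores)) <= score_tolerance
    out_of_steps = steps_taken >= max_steps
    return converged or out_of_steps


# Scores within 0.1 of each other: converged after only 3 steps.
print(simplex_should_stop([-10.00, -10.05, -9.98], 0.1, 3, 500))
```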

 


Using DOCK

 

Installing DOCK

 

1.      Save the archive for your operating system to your hard drive.

2.      Uncompress the archive into a folder called dock5/ in a directory of your choice.

a.      For Windows systems, a Zip file is provided.

b.      For *nix systems, a gzipped archive is provided.

3.      All DOCK 5 binaries are installed in dock5/bin/.

 

The dock5 directory contains the following subdirectories:

 

           

bin/

demo/

installation/

parameters/

src/

utilities/

            accessories/

            grid/

                        GBSA_Grids/

 

 

Compiling DOCK (if required)

 

DOCK comes with platform-specific compiled binaries.  You should not need to compile the code or accessories unless you have made changes to the source code, or are planning to run DOCK on a platform for which we do not distribute binaries.

 

Building DOCK: (all platforms)

            From the dock5 directory:

            cd config/

            ./configure gnu

            make

 

The MPI version of DOCK is built on an MPI library.  The MPICH library is provided freely by Argonne National Laboratory (http://www-unix.mcs.anl.gov/mpi/mpich/).  The MPI library must be installed and running on the system if the MPI features are to be used.  Once MPI is installed, define MPICH_HOME as an environment variable.

 

Building MPI-DOCK (all platforms):

            From the dock5 directory:

            cd config/

            ./configure gnu.parallel

            make

 

NOTE:  MPI-DOCK 5.4 has been compiled with MPICH-1.2.7 on all supported platforms (MPICH-1.2.5 for WinXP).

 

  

Running DOCK

 

For Windows Users:

DOCK and its accessories must be run in a Linux-like environment such as Cygwin (http://www.cygwin.com/).  When you install the environment, make sure to also install compilers and unix shells (“Devel” for Cygwin).

 

DOCK must be run from the command line in a standard unix shell.  It reads a parameter file containing field/value pairs using the following command:

 

            dock5/bin/dock5  -i   parameter.in  [-v1]   [-v2] [-o outputfile.txt]

 

If the parameter file does not exist, DOCK will generate one using your responses to the parameter questions.  If the parameter file exists, any parameter values found will be read. 

 

DOCK 5 outputs the job parameters to the screen at the start of the job, and prints summary information for each molecule processed.  Additional summary information will be included in future releases.  The -v1 flag prints a histogram of sphere matching information.  The -v2 flag prints details about the breakdown of the GB/SA terms.

 

Running DOCK in Parallel

 

If you have installed the MPI library, DOCK can be run in parallel using the following command:

 

            mpirun -np # dock5.mpi -i parameter.in -o outputfile.txt

 

Note that the parallelization is set up to have a single Master node, with the remaining nodes acting as slaves.  The Master node performs file processing and input/output, whereas the slaves perform the actual calculations.  If -np = 1, the code defaults to non-MPI behavior.  As a result, there will be minimal difference in performance between 1 and 2 processors.  Improved performance will only become evident with more than 2 nodes.
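The processor counts can be made concrete with a short Python sketch.  It simply restates the Master/slave arrangement described above and is not taken from the DOCK source; the function name is hypothetical.

```python
def calculation_processes(np):
    """Number of processes performing docking calculations for a given -np."""
    if np < 1:
        raise ValueError("np must be at least 1")
    # With one process the code falls back to non-MPI behavior (one worker).
    # Otherwise one process is the Master and the rest do the calculations.
    return 1 if np == 1 else np - 1


for np in (1, 2, 4, 8):
    print(np, calculation_processes(np))
```

With 1 or 2 processes there is a single process doing calculations in both cases, which is why the speedup only appears beyond 2 nodes.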

 

Running the Demo

 

DOCK 5.4 includes two demonstration files that are designed to test your installation.  These demos must also be run from the command line.

                       

For DOCK: (all platforms)

            From the dock5 directory:

            cd demo

            ./script_clean

            ./script_demo

 

For MPI-DOCK:  (all platforms)

            From the dock5 directory:

            cd demo

            ./script_clean

            ./script_mpi_demo

NOTE:  MPI-DOCK will be run on 4 processors for the demo

 

DOCK 5 Parameters

 

The parameters for several common calculations have been optimized using test sets.  General recommendations for these parameters can be found in dock5/recommended_input.  Below, all available options for DOCK are described in detail. 

 

The DOCK 5 parameter parser requires that the values entered for a parameter exactly match one of the legal values if any legal values are specified.  For example:

 

            param_a                 [5] ():

            param_b                 [5] (0 5 10):

 

param_a can be assigned any value; param_b, however, can only be assigned 0, 5, or 10.  If no value is entered, both will default to a value of 5.  Below are listed all DOCK 5 parameters, their default values, legal values, and a brief description of each.  The parameters are listed in order of function.  For questions requiring a yes/no answer, use the full word (yes or no) rather than y or n.
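The legal-value rule can be mimicked in a few lines of Python.  This is only a sketch of the behavior described above, not the DOCK parser itself; the function name and its signature are hypothetical.

```python
def resolve_parameter(name, entered, default, legal_values=()):
    """Apply the rule above to one parameter.

    Empty input takes the default; otherwise the value must exactly match
    one of the legal values, if any are listed.
    """
    value = entered.strip() or default
    if legal_values and value not in legal_values:
        raise ValueError(f"{name}: '{value}' is not one of {list(legal_values)}")
    return value


print(resolve_parameter("param_a", "42", "5"))                   # any value
print(resolve_parameter("param_b", "", "5", ("0", "5", "10")))  # default
```

Under this rule an abbreviated answer such as y is rejected for a yes/no parameter, because it does not exactly match either legal value.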

 

Ligand I/O Parameters

Parameter Name                    Default           Values          Description
ligand_atom_file                  database.mol2     string          The ligand input filename
ligand_outfile_prefix             output            string          The prefix that all output files will use
limit_max_ligands                 no                bool (yes, no)  Flag to limit the maximum # of ligands that will be read in from a library
write_orientations                no                bool (yes, no)  Flag to write orientations
write_conformations               no                bool (yes, no)  Flag to write conformations
initial_skip                      0                 int             The # of molecules to skip over at the beginning of a library
calculate_rmsd                    no                bool (yes, no)  Flag to perform an RMSD calculation between the final molecule pose and its initial structure
use_rmsd_reference_mol            no                bool (yes, no)  Flag to specify a reference structure for the RMSD calculation (default is the starting structure)
rmsd_reference_filename           ligand_rmsd.mol2  string          File containing the RMSD reference structure
rank_ligands                      no                bool (yes, no)  Flag to enable a ligand top-score list.  These ligands will be written to outfile_ranked.mol2, and outfile_scored.mol2 will be empty by default
max_ranked_ligands                500               int             The # of ligands to be stored in the top score list
scored_conformer_output_override  no                bool (yes, no)  Flag to write all ligands to outfile_scored.mol2, even when rank_ligands is yes
num_scored_conformers_written     1                 int             The # of scored poses for each ligand written to outfile_scored.mol2
cluster_conformations             yes               bool (yes, no)  Flag to enable clustering of fully minimized conformations (NOTE: only available if num_scored_conformers_written > 1)
cluster_rmsd_threshold            2.0               float           The RMSD cutoff used to determine whether conformations should be clustered

 

Orient Ligand Parameters

Parameter Name      Default         Values          Description
orient_ligand       yes             bool (yes, no)  Flag to orient ligand to spheres
automated_matching  yes             bool (yes, no)  Flag to perform automated matching instead of manual matching
distance_tolerence  0.25            float           The distance tolerance applied to each edge in a clique
distance_minimum    2.0             float           The minimum size for an edge in a clique
nodes_minimum       3               int             The minimum # of nodes in a clique
nodes_maximum       10              int             The maximum # of nodes in a clique
receptor_site_file  receptor.sph    string          The file containing the receptor spheres
max_orientations    500             int             The maximum # of orientations that will be cycled through
critical_points     no              bool (yes, no)  Flag to use critical point sphere labeling to target orientations to particular spheres
chemical_matching   no              bool (yes, no)  Flag to use chemical “coloring” of spheres to match chemical labels on ligand atoms
chem_match_tbl      chem_match.tbl  string          File defining the legal chemical type matches/pairings
use_ligand_spheres  no              bool (yes, no)  Flag to enable a sphere file representing ligand heavy atoms to be used to orient the ligand.  Typically used for macromolecular docking
ligand_sphere_file  ligand.sph      string          The file containing the ligand spheres

 

 

Flexible Ligand Parameters

Parameter Name                 Default  Values          Description
flexible_ligand                yes      bool (yes, no)  Flag to perform ligand conformational searching
ag_conf_search                 yes      bool (yes, no)  Flag to use the anchor & grow algorithm to search ligand conformations
min_anchor_size                40       int             The minimum # of heavy atoms for an anchor segment
num_anchor_orients_for_growth  100      int             The maximum number of anchor orientations promoted to the conformational search
number_confs_for_next_growth   100      int             The maximum number of conformations carried forward in the anchor & grow search
use_internal_energy            yes      bool (yes, no)  Flag to add an internal energy term to the score during the conformational search
internal_energy_att_exp        6        int             VDW attractive exponent
internal_energy_rep_exp        12       int             VDW repulsive exponent
internal_energy_dielectric     4.0      float           Dielectric used for electrostatic calculation
use_clash_overlap              no       bool (yes, no)  Flag to check for overlapping atom volumes during anchor and grow
clash_overlap                  0.5      float           Percent of overlap allowed before a clash is declared

 

Ligand Scoring Parameters

Parameter Name                    Default        Values          Description
bump_filter                       yes            bool (yes, no)  Flag to perform bump filtering
bump_grid_prefix                  grid           string          The prefix to the grid file(s) containing the desired bump grid
max_bumps                         2              int             The maximum allowed # of bumps for a molecule to pass the filter
score_molecules                   yes            bool (yes, no)  Enables scoring of molecules
contact_score_primary             no             bool (yes, no)  Flag to perform contact scoring as the primary scoring function
contact_score_secondary           no             bool (yes, no)  Flag to perform contact scoring as the secondary scoring function
contact_score_cutoff_distance     4.5            float           The distance threshold defining a contact
contact_score_clash_overlap       0.75           float           Contact definition for use with intramolecular scoring
contact_score_clash_penalty       50             int             The penalty for each contact overlap made
contact_score_grid_prefix         grid           string          The prefix to the grid files containing the desired cnt grid
grid_score_primary                yes            bool (yes, no)  Flag to perform grid-based energy scoring as the primary scoring function
grid_score_secondary              yes            bool (yes, no)  Flag to perform grid-based energy scoring as the secondary scoring function
grid_score_vdw_scale              1              float           Scalar multiplier of the vdw energy component
grid_score_es_scale               1              float           Scalar multiplier of the electrostatic energy component
grid_score_grid_prefix            grid           string          The prefix to the grid files containing the desired nrg grid
continuous_score_primary          no             bool (yes, no)  Flag to perform continuous non-grid based scoring as the primary scoring function
continuous_score_secondary        no             bool (yes, no)  Flag to perform continuous non-grid based scoring as the secondary scoring function
cont_score_rec_filename           receptor.mol2  string          File that contains receptor coordinates
cont_score_att_exp                6              int             VDW L-J potential attractive exponent
cont_score_rep_exp                12             int             VDW L-J potential repulsive exponent
cont_score_dielectric             4.0            float           Dielectric constant for electrostatic term
cont_score_vdw_scale              1              float           Scalar multiplier of vdw energy component
cont_score_es_scale               1              float           Scalar multiplier of electrostatic energy component
gbsa_pairwise_score_primary       no             bool (yes, no)  Flag to use GB/SA scoring as the primary scoring function
gbsa_pairwise_score_secondary     no             bool (yes, no)  Flag to use GB/SA scoring as the secondary scoring function
gbsa_pairwise_gb_grid_prefix      gb_grid        string          The path to the pairwise GB grids
gbsa_pairwise_sa_grid_prefix      sa_grid        string          The path to the SA grids
gbsa_pairwise_vdw_grid_prefix     grid           string          The path to the dock4 nrg grids, used for the vdw portion of the GB/SA calculation
gbsa_pairwise_screen_file         screen.in      string          GB parameter file for electrostatic screening.  It is located in the parameters directory by default
gbsa_pairwise_solvent_dielectric  78.300003      float           The value for the solvent dielectric

 

Score Optimization Parameters

Parameter Name                   Default  Values          Description
minimize_ligand                  yes      bool (yes, no)  Flag to perform score optimization
minimize_anchor                  yes      bool (yes, no)  Flag to perform rigid optimization of the anchor
minimize_flexible_growth         yes      bool (yes, no)  Flag to perform complete optimization during conformational search
minimize_final_pose              yes      bool (yes, no)  Flag to perform minimization of the final ligand pose
use_advanced_simplex_parameters  no       bool (yes, no)  Flag to use a simplified set of common minimization parameters for each of the minimization steps listed above

 

Basic Simplex Minimizer Parameters

Parameter Name                 Default  Values  Description
simplex_max_cycles             1        int     Maximum # of minimization cycles
simplex_score_converge         0.1      float   Exit cycle when energy converges at cutoff
simplex_cycle_converge         1.0      float   Exit minimization when cycles converge at cutoff
simplex_trans_step             1.0      float   Initial translation step size
simplex_rot_step               0.1      float   Initial rotation step size
simplex_tors_step              10.0     float   Initial torsion angle step size
simplex_anchor_max_iterations  500      int     Maximum # of iterations per cycle per anchor
simplex_grow_max_iterations    500      int     Maximum # of iterations per cycle per growth step
simplex_final_max_iterations   0        int     Maximum # of iterations per cycle for entire molecule using secondary scoring function
simplex_random_seed            0        int     Seed for random number generator

 

Advanced Simplex Minimizer Parameters

Parameter Name                 Default  Values  Description
simplex_anchor_max_iterations  50       int     Maximum # of iterations per cycle
simplex_anchor_max_cycles      1        int     Maximum # of minimization cycles
simplex_anchor_score_converge  0.1      float   Exit cycle when energy converges at cutoff
simplex_anchor_cycle_converge  1.0      float   Exit minimization when cycles converge at cutoff
simplex_anchor_trans_step      1.0      float   Initial translation step size
simplex_anchor_rot_step        0.1      float   Initial rotation step size
simplex_anchor_tors_step       10.0     float   Initial torsion angle step size

 

 

NOTE:  Repeat for minimization of each layer (simplex_grow_xxx) and final minimization (simplex_final_xxx)

 

 

 

Atom & Bond Typing Parameters

Parameter Name   Default         Values                Description
atom_model       all             string (all, united)  Choice of all atom or united atom models
vdw_defn_file    vdw.defn        string                File containing vdw parameters for atom types
flex_defn_file   flex.defn       string                File containing bond definition parameters
flex_drive_file  flex_drive.tbl  string                File containing conformational search parameters
chem_defn_file   chem.defn       string                File containing chemical label (pharmacophore) definitions

 

 


 

ACCESSORIES

 

GRID

Author: Todd Ewing

Based on work by Elaine Meng and Brian Shoichet

 

Overview

GRID creates the grid files necessary for rapid score evaluation in DOCK. Two types of scoring are available: contact and energy scoring. The scoring grids are stored in files ending in *.cnt and *.nrg respectively. When docking, each scoring function is applied independently of the others and the results are written to separate output files.  GRID also computes a bump grid which identifies whether a ligand atom is in severe steric overlap with a receptor atom. The bump grid is identified with a *.bmp file extension. The file containing the bump grid also stores the size, position and grid spacing of all the grids.

The grid calculation must be performed prior to docking. The calculation can take up to 45 minutes, but needs to be done only once for each receptor site. Since DOCK can perform continuous scoring without a grid, the grid calculation is not always required. However, for most docking tasks, such as when multiple binding modes for a molecule or multiple molecules are considered, it is more time efficient to precompute the scoring grids.

 

Bump Checking

Prior to scoring, each orientation can be processed with the bump filter to reject those that penetrate deep into the receptor. Orientations that pass the bump filter are then scored and/or minimized with any of the available scoring functions.  A bump is based on the sum of the van der Waals radii of the two interacting atoms. The user specifies what fraction of the sum is considered a bump. For example, under the default definition, a bump occurs if any two atoms approach closer than 0.75 of the sum of their radii.  At each grid position, GRID stores the smallest ligand atom radius that would still trigger a bump. During docking, for a given orientation, the position of each atom is checked against the bump grid. If the radius of the atom is greater than or equal to the radius stored in the bump grid, then the atom triggers a bump. To conserve disk space, the atom radius is multiplied by 10 and converted to a short unsigned integer.
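The bump test and the compact storage of radii can be sketched in Python.  The sketch follows the description above but is not GRID's code: the function names are hypothetical, and rounding to the nearest integer (rather than truncating) when converting the scaled radius is an assumption.

```python
def bump_distance(radius_a, radius_b, bump_overlap=0.75):
    """Distance below which two atoms bump: a fraction of the sum of radii."""
    return bump_overlap * (radius_a + radius_b)


def encode_radius(radius):
    """Store a radius in angstroms as an unsigned short (radius * 10).

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    value = int(round(radius * 10))
    if not 0 <= value <= 65535:
        raise ValueError("radius out of range for an unsigned short")
    return value


def triggers_bump(atom_radius, stored_value):
    """An atom bumps if its radius is >= the radius stored at its grid point."""
    return atom_radius >= stored_value / 10.0


stored = encode_radius(1.7)          # hypothetical grid point value
print(stored, triggers_bump(1.8, stored), triggers_bump(1.5, stored))
```

Storing the radius as a two-byte integer keeps one decimal place of precision, which is the trade-off made to conserve disk space.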

 

Energy scoring

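The energy scoring component of DOCK is based on the implementation of force field scoring. Force field scores are approximate molecular mechanics interaction energies, consisting of van der Waals and electrostatic components:

$$E = \sum_{i=1}^{lig} \sum_{j=1}^{rec} \left( \frac{A_{ij}}{r_{ij}^{a}} - \frac{B_{ij}}{r_{ij}^{b}} + 332 \frac{q_i q_j}{D \, r_{ij}} \right)$$

where each term is a double sum over ligand atoms i and receptor atoms j, which include the quantities listed below.

r_ij = distance between atoms i and j

A_ij, B_ij = VDW repulsion and attraction parameters

a, b = VDW repulsion and attraction exponents

q_i, q_j = point charges on atoms i and j

D = dielectric function

332 = factor that converts the electrostatic energy into kilocalories per mole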

 

Generalization of the VDW component

The van der Waals component of the scoring function has been generalized to handle any combination of repulsive and attractive exponents (provided that a > b). The user may choose to "soften" the potential by using a 6-9 Lennard-Jones function. The general form of the van der Waals interaction between two identical atoms is:

$$E(r) = \frac{C}{r^{a}} - \frac{D}{r^{b}}$$

where ε is the well depth of the interaction energy, R is the van der Waals radius of the atoms, and the coefficients C and D can be determined given the two following boundary conditions:

$$E = -\varepsilon \quad \text{at} \quad r = 2R$$

$$\frac{dE}{dr} = 0 \quad \text{at} \quad r = 2R$$

Application of these boundary conditions to the above equation yields an expression of the van der Waals interaction with a generalized Lennard-Jones potential:

$$E(r) = \varepsilon \left[ \frac{b}{a-b} \left( \frac{2R}{r} \right)^{a} - \frac{a}{a-b} \left( \frac{2R}{r} \right)^{b} \right]$$

The consequence of using a different exponent for the repulsive term is illustrated in Figure 1. Notice that the well position and depth are unchanged, but that the repulsive barrier has shrunk by about 0.25 Angstrom.

Figure 1:  Distance dependence of the Lennard-Jones Function
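The effect of the repulsive exponent can be checked numerically with a short Python sketch.  It assumes the generalized potential E(r) = ε [ b/(a-b) (2R/r)^a - a/(a-b) (2R/r)^b ], which follows from requiring a minimum of depth ε at r = 2R; the function name and the example atom parameters are hypothetical.

```python
def generalized_lj(r, well_depth, radius, rep_exp=12, att_exp=6):
    """Generalized Lennard-Jones energy between two identical atoms.

    The minimum lies at r = 2 * radius with energy -well_depth for any
    exponents satisfying rep_exp > att_exp.
    """
    if rep_exp <= att_exp:
        raise ValueError("repulsive exponent must exceed attractive exponent")
    a, b = rep_exp, att_exp
    x = 2.0 * radius / r
    return well_depth * (b / (a - b) * x ** a - a / (a - b) * x ** b)


# Hypothetical atom: well depth 0.15 kcal/mol, VDW radius 1.9 angstroms
for rep in (12, 9):
    at_minimum = generalized_lj(3.8, 0.15, 1.9, rep_exp=rep)
    on_wall = generalized_lj(3.4, 0.15, 1.9, rep_exp=rep)
    print(rep, at_minimum, on_wall)
```

Both exponents give the same energy at r = 2R, while the energy on the repulsive wall is lower for the softer exponent of 9.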

 

Precomputing potentials on a grid

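By inspection of the above equations, the repulsion and attraction parameters (A_ii and B_ii) for the interactions of identical atoms can be derived from the van der Waals radius, R, and the well depth, ε:

$$A_{ii} = \varepsilon \frac{b}{a-b} (2R)^{a}$$

$$B_{ii} = \varepsilon \frac{a}{a-b} (2R)^{b}$$

In order to evaluate the interaction energy quickly, the van der Waals and electrostatic potentials are precomputed for the receptor and stored on a grid of points containing the docking site. Precomputing the van der Waals potential requires the use of a geometric mean approximation for the A and B terms, as shown:

$$A_{ij} = \sqrt{A_{ii}} \sqrt{A_{jj}}$$

$$B_{ij} = \sqrt{B_{ii}} \sqrt{B_{jj}}$$

Using this approximation, the first equation can be rewritten:

$$E = \sum_{i=1}^{lig} \left[ \sqrt{A_{ii}} \sum_{j=1}^{rec} \frac{\sqrt{A_{jj}}}{r_{ij}^{a}} - \sqrt{B_{ii}} \sum_{j=1}^{rec} \frac{\sqrt{B_{jj}}}{r_{ij}^{b}} + 332 \, q_i \sum_{j=1}^{rec} \frac{q_j}{D \, r_{ij}} \right]$$

Three values are stored for every grid point k, each a sum over receptor atoms that are within a user defined cutoff distance of the point (r_jk is the distance from receptor atom j to grid point k):

$$aval_k = \sum_{j=1}^{rec} \frac{\sqrt{A_{jj}}}{r_{jk}^{a}}$$

$$bval_k = \sum_{j=1}^{rec} \frac{\sqrt{B_{jj}}}{r_{jk}^{b}}$$

$$esval_k = 332 \sum_{j=1}^{rec} \frac{q_j}{D \, r_{jk}}$$

These values, with trilinear interpolation, are multiplied by the appropriate ligand values to give the interaction energy. GRID calculates the grid values and stores them in files. The values are read in during a DOCK run and used for force field scoring.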
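The interpolation step can be sketched in Python.  This is a generic trilinear interpolation on a regular grid, offered as an illustration rather than as DOCK's implementation; the nested-list grid layout, the function names, and the argument names (rep_sum, att_sum, es_sum for the three interpolated grid values) are assumptions.

```python
def trilinear(grid, origin, spacing, point):
    """Interpolate a value at point from a regular 3-D grid.

    grid[i][j][k] holds the value at origin + spacing * (i, j, k).
    Points on or beyond the upper faces of the box are rejected.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    idx, frac = [], []
    for axis in range(3):
        t = (point[axis] - origin[axis]) / spacing
        i = int(t)                      # lower corner index along this axis
        if t < 0 or i + 1 >= dims[axis]:
            raise ValueError("point lies outside the grid box")
        idx.append(i)
        frac.append(t - i)
    value = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                weight = ((frac[0] if di else 1 - frac[0])
                          * (frac[1] if dj else 1 - frac[1])
                          * (frac[2] if dk else 1 - frac[2]))
                value += weight * grid[idx[0] + di][idx[1] + dj][idx[2] + dk]
    return value


def atom_grid_energy(sqrt_a, sqrt_b, charge, rep_sum, att_sum, es_sum):
    """Combine one ligand atom's values with the interpolated grid values."""
    return sqrt_a * rep_sum - sqrt_b * att_sum + charge * es_sum
```

The total score would then be the sum of atom_grid_energy over all ligand atoms, with each of the three grid values interpolated at that atom's position.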

The user determines the location and dimensions of the grid box using the program SHOWBOX. It is not necessary for the whole receptor to be enclosed; only the regions where ligand atoms may be placed need to be included. The box merely delimits the space where grid points are located, and does not cause receptor atoms to be excluded from the calculation. Besides a direct specification of coordinates, there is an option to center the grid at a sphere cluster center of mass. Any combination of spacing and x, y, and z extents may be used.

 

Contact Scoring

Contact scoring in GRID incorporates the scoring performed with the DISTMAP program developed by Shoichet and Bodian. The score is a summation of the heavy atom contacts (every atom except hydrogen) between the ligand and receptor. A contact is defined as an approach of two atoms within some cutoff distance (usually 4.5 Angstroms). If the two atoms approach close enough to bump (as identified with the bump grid) then the interaction can be penalized by an amount specified by the user. The distance dependence of the contact score is represented in Figure 2.

 

Figure 2. Distance dependence of contact score function

 

The attractive score in GRID is negative and a repulsive score is positive.  This switch of sign is necessary to allow the same minimization protocol to be used for contact scoring as implemented for energy scoring.
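The sign convention and the bump penalty can be illustrated with a direct (non-grid) Python sketch.  It is not the GRID algorithm: the score of -1 per contact, the direct distance test in place of the bump grid lookup, and the function and argument names are assumptions.  The default values for the cutoff, overlap, and penalty are borrowed from the contact scoring parameters listed earlier in this manual.

```python
import math


def contact_score(ligand_atoms, receptor_atoms, cutoff=4.5,
                  clash_overlap=0.75, clash_penalty=50):
    """Sum heavy-atom contacts: -1 per contact, +clash_penalty per bump.

    Each atom is an (x, y, z, vdw_radius) tuple describing a heavy atom.
    """
    score = 0
    for lx, ly, lz, lr in ligand_atoms:
        for rx, ry, rz, rr in receptor_atoms:
            dist = math.dist((lx, ly, lz), (rx, ry, rz))
            if dist < clash_overlap * (lr + rr):
                score += clash_penalty      # repulsive: positive
            elif dist <= cutoff:
                score -= 1                  # attractive: negative
    return score


# Hypothetical ligand atom with one contact, one bump, one distant atom
ligand = [(0.0, 0.0, 0.0, 1.7)]
receptor = [(4.0, 0.0, 0.0, 1.7), (2.0, 0.0, 0.0, 1.7), (10.0, 0.0, 0.0, 1.7)]
print(contact_score(ligand, receptor))
```

Because favorable contacts lower the score and bumps raise it, a minimizer that lowers the energy score also improves the contact score.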

 

Running GRID

GRID must be run from the command line in a standard unix shell.  It reads a parameter file containing field/value pairs using the following command:

 

            grid -i grid.in  [-stv] [-o grid.out]

-i input_file (Input parameters extracted from input_file, or grid.in if not specified)

-o output_file (Output written to output_file, or grid.out if not specified)

-s Input parameters entered interactively

-t Reduced output level

-v Increased output level

 

General Parameters

Parameter Name   Default  Value           Description
compute_grids    no       bool (yes, no)  Flag to compute scoring grids
grid_spacing     0.3      float           The distance between grid points along each axis.
output_molecule  no       bool (yes, no)  Flag to write out the coordinates of the receptor into a new, cleaned-up file. Atoms are resorted to put all residue atoms together. Terminal SYBYL capping groups are merged with the terminal residues.

 

Scoring Parameters

Parameter Name           Default  Value           Description
contact_score            no       bool (yes, no)  Flag to construct contact grid
contact_cutoff_distance  4.5      float           Maximum distance between heavy atoms for the interaction to be counted as a contact
chemical_score           no       bool (yes, no)  Flag to construct chemical grid
energy_score             no       bool (yes, no)  Flag to perform energy scoring
energy_cutoff_distance   10       float           Maximum distance between two atoms for their contribution to the energy score to be computed
atom_model               u        string (u, a)   Flag for how to model nonpolar hydrogens.  u = United atom model: hydrogens attached to carbons are assigned a zero VDW well-depth and the partial charge is transferred to the carbon.  a = All atom model: hydrogens attached to carbons have regular VDW well-depth and the partial charge is not modified.
attractive_exponent      6        int             Exponent of attractive Lennard-Jones term for VDW potential
repulsive_exponent       12       int             Exponent of repulsive Lennard-Jones term for VDW potential
distance_dielectric      yes      bool (yes, no)  Flag to make the dielectric depend linearly on the distance
dielectric_factor        4.0      float           Coefficient of the dielectric
bump_filter              no       bool (yes, no)  Flag to screen each orientation for clashes with receptor prior to scoring and minimizing
bump_overlap             0.75     float           Amount of VDW overlap allowed. If the probe atom and the receptor heavy atom approach closer than this fraction of the sum of their VDW radii, then the position is flagged as a bump.  0 = Complete overlap allowed.  1 = No overlap allowed.

 

File Input

Parameter Name       Default        Value   Description
receptor_file        receptor.mol2  string  Receptor coordinate file. Partial charges and atom types need to be present.
box_file             site_box.pdb   string  SHOWBOX output file which specifies the boundaries of the grid
vdw_definition_file  vdw.defn       string  VDW parameter file.

 

File Output

Parameter Name     Default            Value   Description
score_grid_prefix  grid               string  Core file name of grids (file extension will be appended automatically)
receptor_out_file  receptor_out.mol2  string  File for cleaned-up receptor when output_molecule set

 

 

NCHEMGRID_GB and NCHEMGRID_SA

Author:  Xiaoqin Zou

 

Overview

The NCHEMGRID_GB and NCHEMGRID_SA programs create the GB and SA receptor grids for use with DOCK 5's GB/SA scoring function.

 

Input

Both programs require that an INCHEM file be created in the working directory, which contains the parameters to control the program.  The INCHEM parameters for both the NCHEMGRID_GB and NCHEMGRID_SA programs are detailed below:

 

For NCHEMGRID_GB:

receptor.pdb                                                   ; receptor pdb file

cavity.pdb                                                        ; cavity pdb file

parameters/prot.table.ambcrg.ambH          ; charge parameter file

parameters/vdw.parms.amb                         ; VDW parameter file

box.pdb                                                           ; box pdb file

0.4                                                                    ; grid spacing in angstroms

2                                                                       ; es type: GB

1                                                                       ; es scale for ff scoring

8.0 8.0                                                             ; cutoff for es and outer box

78.3 78.3                                                         ; dielectric of solvent,cavity

2.3 2.8                                                             ; bumping distances

output_prefix                                                   ; output grid prefix name

1                                                                       ; pairwise calculation

 

NOTE:  The cavity.pdb file should be an empty file.  This feature is not frequently used.  However, the parameter must still be passed.  The pairwise calculation value must also always be 1.
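Because the INCHEM file is positional, it can be convenient to generate it from a script. The Python sketch below writes the thirteen NCHEMGRID_GB values listed above, in the same order. The names GB_FIELDS, build_inchem and write_inchem are hypothetical helpers, not part of DOCK, and the sketch assumes that the text after each ";" above is descriptive annotation, so only the values are written.

```python
# Hypothetical helper: builds an INCHEM file for NCHEMGRID_GB.
# Values and their order are copied from the listing above.
GB_FIELDS = [
    ("receptor.pdb", "receptor pdb file"),
    ("cavity.pdb", "cavity pdb file (should be empty)"),
    ("parameters/prot.table.ambcrg.ambH", "charge parameter file"),
    ("parameters/vdw.parms.amb", "VDW parameter file"),
    ("box.pdb", "box pdb file"),
    ("0.4", "grid spacing in angstroms"),
    ("2", "es type: GB"),
    ("1", "es scale for ff scoring"),
    ("8.0 8.0", "cutoff for es and outer box"),
    ("78.3 78.3", "dielectric of solvent, cavity"),
    ("2.3 2.8", "bumping distances"),
    ("output_prefix", "output grid prefix name"),
    ("1", "pairwise calculation (must always be 1)"),
]


def build_inchem(fields):
    """Return the INCHEM text: one value per line, annotations dropped (assumption)."""
    return "\n".join(value for value, _ in fields) + "\n"


def write_inchem(path="INCHEM", fields=GB_FIELDS):
    """Write the INCHEM file into the working directory."""
    with open(path, "w") as handle:
        handle.write(build_inchem(fields))
```

Calling write_inchem() in the working directory produces the file; the cavity.pdb file still has to exist as an empty file, as noted above.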

 

For NCHEMGRID_SA:

receptor.pdb                                                   ; receptor pdb file

parameters/prot.table.ambcrg.ambH          ; charge parameter file

parameters/vdw.parms.amb                          ; VDW parameter file

box.pdb                                                            ; box pdb file

0.4                                                                     ; grid spacing in angstroms

1.4                                                                     ; probe radius for SA

2                                                                        ; scoring type: SA

8.0                                                                     ; cutoff for SA calculations

output_prefix                                                    ; output grid prefix name

 

 

SPHGEN

Author: Irwin D. Kuntz

Modified by: Renee DesJarlais, Brian Shoichet

 

Overview

SPHGEN generates sets of overlapping spheres to describe the shape of a molecule or molecular surface. For receptors, a negative image of the surface invaginations is created; for a ligand, the program creates a positive image of the entire molecule. Spheres are constructed using the molecular surface described by Richards (1977) calculated with the program dms (www.cgl.ucsf.edu). Each sphere touches the molecular surface at two points and has its radius along the surface normal of one of the points. For the receptor, each sphere center is “outside” the surface, and lies in the direction of a surface normal vector. For a ligand, each sphere center is “inside” the surface, and lies in the direction of a reversed surface normal vector. Spheres are calculated over the entire surface, producing approximately one sphere per surface point. This very dense representation is then filtered to keep only the largest sphere associated with each receptor surface atom. The filtered set is then clustered on the basis of radial overlap between the spheres using a single linkage algorithm. This creates a negative image of the receptor surface, where each invagination is characterized by a set of overlapping spheres. These sets, or “clusters,” are sorted according to numbers of constituent spheres, and written out in order of descending size. The largest cluster is typically the ligand binding site of the receptor molecule. The program showsphere writes out sphere center coordinates in PDB format and may be helpful for visualizing the clusters. 

                                  

Input

rec.ms            #molecular surface file

R                     #sphere outside of surface (R) or inside surface (L)

X                     #specifies subset of surface points to be used (X=all points)

0.0                   #prevents generation of large spheres with close surface contacts (default=0.0)

4.0                   #maximum sphere radius in Angstroms (default=4.0)

1.4                   #minimum sphere radius in Angstroms (default=radius of probe)

rec.sph           #clustered spheres file

NOTES: 

1)  The input file names and parameters are read from a file called INSPH, which should not contain any blank lines or the comments (denoted by #) from above.

2)  The molecular surface file must include surface normals. SPHGEN expects the Fortran format

(A3, I5, X, A4, X, 2F8.3, F9.3, X, A3, 7X, 3F7.3).
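Since INSPH must not contain blank lines or the "#" comments shown above, one way to keep an annotated copy for reference is to strip the annotations before running SPHGEN. The following Python sketch does that; the function name to_insph and the ANNOTATED template are illustrative assumptions, not part of the DOCK distribution.

```python
# Annotated template, copied from the Input listing above.
ANNOTATED = """\
rec.ms    #molecular surface file
R         #sphere outside of surface (R) or inside surface (L)
X         #specifies subset of surface points to be used (X=all points)
0.0       #prevents generation of large spheres with close surface contacts

4.0       #maximum sphere radius in Angstroms
1.4       #minimum sphere radius in Angstroms
rec.sph   #clustered spheres file
"""


def to_insph(annotated):
    """Drop '#' comments and blank lines, keeping one value per line."""
    values = []
    for raw in annotated.splitlines():
        value = raw.split("#", 1)[0].strip()
        if value:
            values.append(value)
    return "\n".join(values) + "\n"


if __name__ == "__main__":
    # Write the cleaned file that SPHGEN reads.
    with open("INSPH", "w") as handle:
        handle.write(to_insph(ANNOTATED))
```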

 

Output

Some informative messages are written to a file called OUTSPH. This includes the parameters and files used in the calculation. The spheres themselves are written to the clustered spheres file. They are arranged in clusters with the cluster having the largest number of spheres appearing first. The sphere cluster file consists of a header followed by a series of sphere clusters. The header is the line “DOCK 3.5 receptor_spheres” followed by a color table. The color table contains color names (format A30) each on a separate line.  As SPHGEN produces no colors, the color table is simply absent.

The sphere clusters themselves follow, each of which starts with the line

 

cluster n number of spheres in cluster i

 

where n is the cluster number for that sphere cluster, and i is the number of spheres in that cluster.  Next, all spheres in that cluster are listed in the format  (I5, 3F10.5, F8.3, I5, I2, I3)  where the values correspond to, respectively,

 

The clusters are listed in numerical order from the largest cluster found to the smallest. At the end of the clusters is cluster number 0. This is not an actual sphere cluster, but a list of all of the spheres generated whose radii were larger than the minimum radius, before the filtering heuristics (i.e., allowing only one sphere per atom and using a maximum radius cutoff) and clustering were performed. Cluster 0 may be useful as a starting point for users who want to explore a wider range of possible clusters than is provided by the standard SPHGEN clustering routine.  The program creates three temporary files: temp1.ms, temp2.sph, and temp3.atc. These are used internally by SPHGEN, and are deleted upon completion of the computation.  For more information on sphere generation and selection, go to the demo. 
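The layout described above (a cluster header line followed by fixed-width sphere records) can be read with a short script. The Python sketch below splits each sphere line according to the widths implied by the format (I5, 3F10.5, F8.3, I5, I2, I3) and groups the records by cluster number. The function names are hypothetical, and the fields are returned as an unnamed tuple in file order because this sketch makes no assumption about what each column means.

```python
import re

# Header line of each cluster: "cluster n number of spheres in cluster i"
HEADER = re.compile(r"^cluster\s+(\d+)\s+number of spheres in cluster\s+(\d+)")

# Column widths and types implied by (I5, 3F10.5, F8.3, I5, I2, I3)
WIDTHS = (5, 10, 10, 10, 8, 5, 2, 3)
CASTS = (int, float, float, float, float, int, int, int)


def parse_sphere_line(line):
    """Split one fixed-width sphere record into a tuple of 8 values."""
    fields, pos = [], 0
    for width, cast in zip(WIDTHS, CASTS):
        text = line[pos:pos + width].strip()
        fields.append(cast(text) if text else None)  # None for missing columns
        pos += width
    return tuple(fields)


def parse_clusters(text):
    """Return {cluster_number: [sphere tuples]}; cluster 0 is included if present."""
    clusters, current = {}, None
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = int(match.group(1))
            clusters[current] = []
        elif current is not None and line.strip():
            clusters[current].append(parse_sphere_line(line))
    return clusters
```

Lines that precede the first cluster header, such as the "DOCK 3.5 receptor_spheres" line, are skipped.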

 

 

SHOWBOX

Author: Elaine Meng

 

SHOWBOX is an interactive program that allows visualization of the location and size of the grids that will be calculated by the program grid, using any graphics program that can display PDB format. The user is asked whether the box should be automatically constructed to enclose all of the spheres in a cluster. If so, the user must also enter a value for how closely the box faces may approach a sphere center (how large a “cushion” of space is desired) and the sphere cluster filename and number. If not, the user is asked whether the box will be centered on manually entered coordinates or a sphere cluster center of mass. Depending on the response, the coordinates of the center or the sphere cluster filename and number are requested. Finally, the user must enter the desired box dimensions (if not automatic) and a name for the output PDB-format box file.
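The automatic option can be pictured as an axis-aligned bounding box around the sphere centers, enlarged by the cushion on every face. The Python sketch below illustrates that idea only; it is not SHOWBOX's source code, and the function name enclosing_box is hypothetical. It assumes the cushion is measured from sphere centers, as the description above suggests.

```python
def enclosing_box(centers, cushion):
    """Return (box_center, box_dimensions) for a box whose faces stay at
    least `cushion` away from every sphere center in `centers`."""
    xs, ys, zs = zip(*centers)
    low = (min(xs) - cushion, min(ys) - cushion, min(zs) - cushion)
    high = (max(xs) + cushion, max(ys) + cushion, max(zs) + cushion)
    center = tuple((a + b) / 2 for a, b in zip(low, high))
    dimensions = tuple(b - a for a, b in zip(low, high))
    return center, dimensions


if __name__ == "__main__":
    # Two sphere centers and a cushion of 1 (same units as the coordinates)
    print(enclosing_box([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)], 1.0))
```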

 

 

SHOWSPHERE

Authors: Stuart Oatley, Elaine Meng, Daniel Gschwend

 

SHOWSPHERE is an interactive program; it produces a PDB-format file of sphere centers and an MS-like file of sphere surfaces, given the sphere cluster file and cluster number. The surface file generation is optional.  The user may specify one cluster or “all,” and multiple output files will be generated, with the cluster number appended to the end of the name of each file. The input cluster file is created using SPHGEN.  SHOWSPHERE requests the name of the sphere cluster file, the number of the cluster of interest, and names for the output files. Information is sent to the screen while the spheres are being read in, and while the surface points are being calculated.

 

SPHERE_SELECTOR

Author:  P. Therese Lang

SPHERE_SELECTOR takes the output from SPHGEN and selects all spheres within a user-defined radius of a target molecule.  The target molecule can be anything (e.g., a known ligand, a receptor residue, etc.) as long as it is in proper MOL2 format.  The required input for sphere_selector is:

USAGE: sphere_selector <sphere_cluster_file.sph> <set_of_atoms.mol2> <radius>

Please note that the above order of input files must be maintained for the program to work.
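The selection criterion can be sketched in a few lines of Python: a sphere is kept if its center lies within the given radius of any atom of the target molecule. This illustrates the idea only and is not the program's implementation; the function name select_spheres is hypothetical, and treating the boundary as inclusive and ignoring the sphere's own radius are assumptions.

```python
import math


def select_spheres(sphere_centers, atom_coords, radius):
    """Keep sphere centers that lie within `radius` of at least one atom."""
    return [
        center
        for center in sphere_centers
        if any(math.dist(center, atom) <= radius for atom in atom_coords)
    ]


if __name__ == "__main__":
    spheres = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    atoms = [(1.0, 0.0, 0.0)]
    print(select_spheres(spheres, atoms, 2.0))
```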

 


APPENDIX 1:  Parameter File Formats

 

Overview

The parameter files contain atom and bond data needed during DOCK calculations. The definition (*.defn) files contain atom and bond labeling data. The table (*.tbl) files contain additional data for chemical interactions and flexible bond torsion positions. They may be edited by the user.

 

Atom Definition Rules

The definition files use a consistent atom labeling convention for which an atom in virtually any chemical environment can be identified. The specification of adjacent atoms is nested using the elements listed below.

 

·        Each element must be separated by a space.

·        If more than one adjacent atom is specified, then ALL must be present (i.e. a boolean AND for rules within a line).

·        If a label can have multiple definition lines, then any ONE of them must be satisfied for inclusion (i.e. a boolean OR for rules on different lines).

 

Atom Definition Elements

Element      Function

atom type    Specifies partial or complete atom type. A partial specification is more general (i.e. "C" versus "C.3"). An asterisk (*) specifies ANY atom type.

( )          Specifies atoms that must be bonded to parent atom.

[ ]          Specifies atoms that must NOT be bonded to parent atom.

integer      Specifies the number of atoms of the following type that must be bonded.

 

Example Definitions

Example                                Explanation

C.2 ( 2 O.co2 )                        A carboxylate carbon.

.3 [ 3 H ]                             Any sp3 hybridized atom that is not attached to three hydrogens.

C. [ O. ] [ N. [ 2 O.2 ] [ 2 C. ] ]    Any carbon not attached to an oxygen or a nitrogen (unless the nitrogen is a nitro or tertiary nitrogen).
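Because every element is separated by a space and the ( ) and [ ] groups nest, a definition line can be read with a small recursive parser. The Python sketch below turns a definition into a nested dictionary with "required" and "forbidden" neighbor lists. It only parses the syntax described above and does not perform atom typing; the function names and the dictionary layout are illustrative assumptions, and a missing integer is stored as None rather than given a meaning.

```python
def parse_definition(text):
    """Parse one definition line into a nested dict."""
    tokens = text.split()
    node, pos = _parse_atom(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return node


def _parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("atom type expected")
    node = {"type": tokens[pos], "required": [], "forbidden": []}
    pos += 1
    # Each following ( ... ) or [ ... ] group describes one neighbor rule.
    while pos < len(tokens) and tokens[pos] in ("(", "["):
        opener = tokens[pos]
        closer = ")" if opener == "(" else "]"
        pos += 1
        count = None  # optional integer in front of the neighbor type
        if pos < len(tokens) and tokens[pos].isdigit():
            count = int(tokens[pos])
            pos += 1
        child, pos = _parse_atom(tokens, pos)
        if pos >= len(tokens) or tokens[pos] != closer:
            raise ValueError("missing " + closer)
        pos += 1
        key = "required" if opener == "(" else "forbidden"
        node[key].append((count, child))
    return node, pos


if __name__ == "__main__":
    print(parse_definition("C. [ O. ] [ N. [ 2 O.2 ] [ 2 C. ] ]"))
```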

 

vdw.defn

This file contains atom labels and definitions for van der Waals atom typing.

·        The following data types are associated with each atom: VDW radius, VDW well-depth, flag for heavy atom, number of attached atoms.

·        Some labels are used only for the united-atom model, some for only the all-atom model, and some for either.

·        A label may have multiple definitions.

 

Sample Entries from vdw.defn

_____________________________________

name Carbon_sp/sp2

atom_model either

radius 1.850

well_depth 0.120

heavy_flag 1

valence 4

 

definition C

_____________________________________

name Carbon_All_sp3

atom_model all

radius 1.800

well_depth 0.060

heavy_flag 1

valence 4

 

definition C.3

_____________________________________

name Carbon_United_CH3

atom_model united

radius 2.000

well_depth 0.150

heavy_flag 1

valence 4

 

definition C. ( 3 H )

_____________________________________
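The entries above follow a simple "keyword value" layout, with each entry starting at a name line and separated by a line of underscores. The Python sketch below reads such text into a list of dictionaries and collects repeated definition lines into a list. The function name parse_defn is hypothetical and this is not DOCK's own reader; values are kept as strings rather than converted.

```python
def parse_defn(text):
    """Parse *.defn-style text into a list of entry dictionaries."""
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if set(line) == {"_"}:  # separator line between entries
            current = None
            continue
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "name":
            current = {"name": value, "definitions": []}
            entries.append(current)
        elif current is None:
            raise ValueError("field before any name line: " + line)
        elif key == "definition":
            current["definitions"].append(value)  # a label may have several
        else:
            current[key] = value
    return entries


SAMPLE = """\
name Carbon_sp/sp2
atom_model either
radius 1.850
well_depth 0.120
heavy_flag 1
valence 4
definition C
_____________________________________
name Carbon_United_CH3
atom_model united
radius 2.000
definition C. ( 3 H )
"""

if __name__ == "__main__":
    for entry in parse_defn(SAMPLE):
        print(entry["name"], entry["radius"], entry["definitions"])
```

The same layout is used by the chem.defn and flex.defn samples in this appendix, so the sketch should apply to them as well.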

 

chem.defn

This file contains labels and definitions for chemical labeling.

·        Nothing in addition to a label is assigned to an atom.

·        A label may have multiple definition lines.

 

Sample Entries from chem.defn

________________________________________________________

name hydrophobic

definition C. [ O. ] [ N. [ 2 O.2 ] [ 2 C. ] ] ( * )

definition N.pl3 ( 3 C. )

definition Cl ( C. )

definition Br ( C. )

definition I ( C. )

definition C.3 [ * ]

________________________________________________________

name donor

definition N. ( H )

definition N.4 [ * ]

________________________________________________________

name acceptor

definition O. [ H ] [ N. ] ( * )

definition O.3 ( 1 * ) [ N. ]

definition O.co2 ( C.2 ( O.co2 ) )

definition N. [ H ] [ N. ] [ O. ] [ 3 . ] ( * )

definition O.2 [ * ]

________________________________________________________

 

chem_match.tbl

This file contains the interaction matrix that specifies which chemical labels can form an interaction during matching.

·        The labels must be identical to labels in chem.defn.

·        The table flag indicates the beginning of the interaction table.

·        Compatible labels are identified with a one, otherwise a zero.

 

Sample of chem_match.tbl

 

label null

label hydrophobic

label donor

label acceptor

label polar

table

1

1          1

1          0          1

1          0          0          1

1          0          1          1          1
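The sample above lists the labels first and then a lower-triangular table with one row per label. The Python sketch below reads that layout into a lookup keyed by label pairs. It assumes the table is symmetric, so that row i, column j applies to both orderings of the two labels; the function name parse_chem_match is hypothetical.

```python
def parse_chem_match(text):
    """Return {(label_a, label_b): bool} from chem_match.tbl-style text."""
    labels, rows, in_table = [], [], False
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if in_table:
            rows.append([int(p) for p in parts])
        elif parts[0] == "label":
            labels.append(parts[1])
        elif parts[0] == "table":
            in_table = True
    if len(rows) != len(labels):
        raise ValueError("expected one table row per label")
    compatible = {}
    for i, row in enumerate(rows):
        if len(row) != i + 1:
            raise ValueError("row %d should have %d entries" % (i + 1, i + 1))
        for j, flag in enumerate(row):
            # Assumption: the matrix is symmetric, so fill both orderings.
            compatible[(labels[i], labels[j])] = bool(flag)
            compatible[(labels[j], labels[i])] = bool(flag)
    return compatible


SAMPLE = """\
label null
label hydrophobic
label donor
label acceptor
label polar
table
1
1 1
1 0 1
1 0 0 1
1 0 1 1 1
"""

if __name__ == "__main__":
    table = parse_chem_match(SAMPLE)
    print(table[("polar", "donor")], table[("donor", "hydrophobic")])
```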


flex.defn

This file contains labels and definitions for flexible bond identification.

·        The drive_id field corresponds to a torsion type in the flex_drive.tbl file.

·        The minimize field is a flag for whether the bond may be minimized.

·        Two definition lines must be present. Each definition corresponds to an atom at either end of the bond.

 

Sample Entries from flex.defn

________________________________________

name sp3-sp3

drive_id 3

minimize 1

definition .3 [ 3 H ] [ 3 O.co2 ]

definition .3 [ 3 H ] [ 3 O.co2 ]

________________________________________

name sp3-sp2

drive_id 4

minimize 1

definition .3 [ 3 H ] [ 3 O.co2 ]

definition .2 [ 2 H ] [ 2 O.co2 ]

________________________________________

name sp2-sp2

drive_id 2

minimize 0

definition .2 [ 2 H ] [ 2 O.co2 ]

definition .2 [ 2 H ] [ 2 O.co2 ]

________________________________________

 

flex_drive.tbl

This file contains torsion positions assigned to each rotatable bond when the flexible docking parameter is used in DOCK.

·        The drive_id field corresponds to each torsion type.

·        The positions field specifies the number of torsion angles to sample.

·        The torsions field specifies the angles that are sampled.

 

Sample Entries from flex_drive.tbl

_________________________________________

drive_id 2

positions 2

torsions 0 180

_________________________________________

drive_id 3

positions 3

torsions -60 60 180

_________________________________________

drive_id 4

positions 4

torsions -90 0 90 180

_________________________________________
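In the sample entries, the positions value always equals the number of angles on the torsions line. The Python sketch below reads entries of this form and checks that consistency, which may help when editing the table by hand. The function name parse_flex_drive is hypothetical, and treating a mismatch as an error is an assumption of this sketch rather than documented DOCK behavior.

```python
def parse_flex_drive(text):
    """Return {drive_id: {"positions": int, "torsions": [float, ...]}}."""
    drives, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"_"}:  # skip blanks and separators
            continue
        key, *values = line.split()
        if key == "drive_id":
            current = {"positions": None, "torsions": []}
            drives[int(values[0])] = current
        elif current is None:
            raise ValueError("field before any drive_id: " + line)
        elif key == "positions":
            current["positions"] = int(values[0])
        elif key == "torsions":
            current["torsions"] = [float(v) for v in values]
    for drive_id, drive in drives.items():
        if drive["positions"] != len(drive["torsions"]):
            raise ValueError("drive_id %d: positions does not match torsions" % drive_id)
    return drives


SAMPLE = """\
drive_id 2
positions 2
torsions 0 180
_________________________________________
drive_id 3
positions 3
torsions -60 60 180
_________________________________________
drive_id 4
positions 4
torsions -90 0 90 180
"""

if __name__ == "__main__":
    print(parse_flex_drive(SAMPLE))
```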

 

APPENDIX 2:  Molecular File Formats

Tripos MOL2 format

 

Overview

This format is used for general molecule input and output of DOCK.  This format has the advantage of storing all the necessary information for atom features, position, and connectivity. It is also a standardized format that other modeling programs can read.

 

Specification

Of the many record types in a MOL2 file, DOCK recognizes the following: MOLECULE, ATOM, BOND, SUBSTRUCTURE and SET. In the MOLECULE record, DOCK utilizes information about the molecule name and number of atoms, bonds, substructures and sets. In the ATOM record, DOCK utilizes information about the atom names, types, coordinates, and partial charges. In the BOND record, DOCK utilizes the atom identifiers for the bond. In the SUBSTRUCTURE record, DOCK records the fields, but does not utilize them. The SET records are entirely optional. They are used only in special circumstances, such as when the user wants to define the anchor for flexible docking.

Please refer to Sybyl documentation for additional information.

 

Example

This example file illustrates all the elements of the MOL2 file read and written by DOCK. It includes optional SET records which are used by the ligand flexibility routines.

 

@<TRIPOS>MOLECULE

example

   23    23     1     0     2

GAST_HUCK

 

 

@<TRIPOS>ATOM

      1 C           1.2300    0.7100    0.0000 C.ar      1 ABC        -0.0529

      2 H           2.1620    1.2490    0.0000 H         1 ABC         0.0571

      3 C           0.0000    1.4200    0.0000 C.ar      1 ABC        -0.0242

      4 C           0.0000    2.9060    0.0000 C.3       1 ABC         0.0319

      5 H          -0.5150    3.2700    0.8920 H         1 ABC         0.0598

      6 H          -0.5150    3.2700   -0.8920 H         1 ABC         0.0598

      7 N           1.3680    3.3900    0.0000 N.3       1 ABC        -0.3075

      8 H           1.8480    3.0500    0.8320 H         1 ABC         0.1243

      9 C           1.3680    4.8410    0.0000 C.3       1 ABC        -0.0136

     10 H           0.8530    5.2050    0.8920 H         1 ABC         0.0390

     11 H           0.8530    5.2050   -0.8920 H         1 ABC         0.0390

     12 H           2.3990    5.2050    0.0000 H         1 ABC         0.0390

     13 C          -1.2300    0.7100    0.0000 C.ar      1 ABC        -0.0529

     14 H          -2.1620    1.2490    0.0000 H         1 ABC         0.0571

     15 C          -1.2300   -0.7100    0.0000 C.ar      1 ABC        -0.0471

     16 H          -2.1620   -1.2490    0.0000 H         1 ABC         0.0655

     17 C          -0.0000   -1.4200    0.0000 C.ar      1 ABC         0.0547

     18 C           1.2300   -0.7100    0.0000 C.ar      1 ABC        -0.0471

     19 H           2.1620   -1.2490    0.0000 H         1 ABC         0.0655

     20 C          -0.0000   -2.7800    0.0000 C.2       1 ABC         0.2555

     21 O          -0.0850   -3.5030    0.9750 O.2       1 ABC        -0.3604

     22 O           0.1090   -3.2660   -1.2480 O.3       1 ABC        -0.2944

     23 H           0.1010   -4.2400   -1.1490 H         1 ABC         0.2522

@<TRIPOS>BOND

     1    1    2 1   

     2    1   18 ar  

     3    1    3 ar  

     4    3   13 ar  

     5    3    4 1   

     6    4    6 1   

     7    4    5 1   

     8    4    7 1   

     9    7    8 1   

    10    7    9 1   

    11    9   12 1   

    12    9   11 1   

    13    9   10 1   

    14   13   14 1   

    15   13   15 ar  

    16   15   16 1   

    17   15   17 ar  

    18   17   20 1   

    19   17   18 ar  

    20   18   19 1   

    21   20   21 2   

    22   20   22 1   

    23   22   23 1   

@<TRIPOS>SUBSTRUCTURE

     1 ABC         1 GROUP             0 A     ****    0 ROOT

@<TRIPOS>SET

ANCHOR          STATIC     ATOMS    <user>   **** Anchor atom set

6 1 3 13 15 17 18

RIGID           STATIC     BONDS    <user>   **** Rigid bond set

6 2 3 4 15 17 19
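For readers who want to inspect such a file outside of DOCK, the Python sketch below splits a single-molecule MOL2 text into its @<TRIPOS> records and converts the ATOM lines into dictionaries. It is an illustration under stated assumptions, not DOCK's reader: the function names and the abbreviated SAMPLE are hypothetical, blank lines are skipped (which discards the positional meaning of empty MOLECULE lines), and ATOM lines are assumed to carry all nine whitespace-separated fields shown in the example.

```python
def read_mol2_records(text):
    """Return {record_name: [lines]} for one molecule."""
    records, current = {}, None
    for line in text.splitlines():
        if line.startswith("@<TRIPOS>"):
            current = line[len("@<TRIPOS>"):].strip()
            records[current] = []
        elif current is not None and line.strip():
            records[current].append(line)
    return records


def parse_atoms(atom_lines):
    """Convert ATOM record lines into dictionaries."""
    atoms = []
    for line in atom_lines:
        f = line.split()
        atoms.append({
            "id": int(f[0]),
            "name": f[1],
            "xyz": (float(f[2]), float(f[3]), float(f[4])),
            "type": f[5],
            "subst_id": int(f[6]),
            "subst_name": f[7],
            "charge": float(f[8]),
        })
    return atoms


# Abbreviated, hypothetical fragment in the same layout as the example above.
SAMPLE = """\
@<TRIPOS>MOLECULE
fragment
    2     1     1     0     0
GAST_HUCK

@<TRIPOS>ATOM
      1 C           1.2300    0.7100    0.0000 C.ar      1 ABC        -0.0529
      2 H           2.1620    1.2490    0.0000 H         1 ABC         0.0571
@<TRIPOS>BOND
     1    1    2 1
"""

if __name__ == "__main__":
    records = read_mol2_records(SAMPLE)
    name = records["MOLECULE"][0]
    n_atoms = int(records["MOLECULE"][1].split()[0])
    atoms = parse_atoms(records["ATOM"])
    print(name, n_atoms, [a["type"] for a in atoms])
```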