[Dock-fans] Segmentation fault on running docking

Scott Brozell sbrozell at scripps.edu
Fri Nov 9 18:50:58 PST 2007


Hi,

On Thu, 8 Nov 2007, Francesco Pietra wrote:

> After the re-installation of dock described below, I added to my .bashrc
> MPICH_HOME=/usr/local
> export MPICH_HOME
> and made a copy of 'mpirun' from /usr/local/bin to /usr/local/dock6/bin.
> The test no longer complained about mpirun (attached test8NovC.out).


Yes.   Your attached error was
Processing test mpi
/bin/mpirun -np 2 ../../../bin/dock6.mpi -i mpi.dockin -o mpi.dockmpiout
make[3]: /bin/mpirun: Command not found

MPICH_HOME needs to be defined; this is the command used in the tests:
$(MPICH_HOME)/bin/mpirun
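
For example (an untested sketch that just mirrors your own commands), putting

MPICH_HOME=/usr/local
export MPICH_HOME

into your .bashrc, or exporting it in the shell you build and test from, and
then re-running

cd test
make test 2>&1 | tee test.out

should let the test makefiles find mpirun without copying it into
/usr/local/dock6/bin.  (test.out is just an example name.)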


> Rigid docking with the protein deprived of HOH ran correctly (attached
> errors_rig_noHOH8Nov.out). However, top -i showed only two nodes involved. A
> test.parallel with Amber9 showed all four nodes involved with sander.
>
> Under these conditions, rigid docking with the protein containing HOH failed
> (attached errors_rig_HOH8Nov.out) because of a segmentation fault.


errors_rig_noHOH8Nov.out
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...

is just four lines!  Where are the docking results?


errors_rig_HOH8Nov.out
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...
[deb64:20728] *** Process received signal ***
[deb64:20728] Signal: Segmentation fault (11)
[deb64:20728] Signal code: Address not mapped (1)
[deb64:20728] Failing at address: 0x2acf65505000
[deb64:20728] [ 0] /lib/libpthread.so.0 [0x2aced8030410]
[deb64:20728] [ 1] dock6.mpi(_ZN6Orient12match_ligandER7DOCKMol+0x40b) [0x447b1b]
[deb64:20728] [ 2] dock6.mpi(main+0xaf5) [0x42cc75]
[deb64:20728] [ 3] /lib/libc.so.6(__libc_start_main+0xda) [0x2aced81554ca]
[deb64:20728] [ 4] dock6.mpi(__gxx_personality_v0+0xc2) [0x41b4ea]


This still looks like an mpi issue.
Is it true that you have run sander.MPI successfully?
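
One quick sanity check is to confirm that dock6.mpi and the mpirun you launch
it with come from the same OpenMPI installation.  Something along these lines
(an untested sketch; the /usr/local/dock6/bin path comes from your DOCK_HOME,
so adjust it if dock6.mpi lives elsewhere):

which mpirun
ompi_info | head
ldd /usr/local/dock6/bin/dock6.mpi | grep -i mpi

If dock6.mpi is dynamically linked, the ldd output should point at libraries
under /usr/local.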


> I forgot to mention that for both the naked protein and the protein containing
> a single HOH residue, I went on with all the spheres generated by sphgen_cpp.
> The array of spheres generated for the HOH-containing protein (viewed as a pdb
> file via "showsphere < sphgen_cpp_cluster.in"; 74 KB) is huge and dense,
> encompassing the whole protein. This results in a grid.nrg of 149 MB.
>
> For unclear reasons (since I operated the same way), the array of spheres is
> not nearly as dense for the naked protein, resulting in a sphgen_cpp_cluster1.pdb
> of 15 KB and a grid.nrg of 80 MB.


So the only difference is one water molecule, but the sphere file size
increases by a factor of 5 and the grid by a factor of 2?
Something is fishy here.  What does visualization of the sphere files show?
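
If a molecular viewer is not handy, even a rough sphere count from the two
showsphere pdb outputs would make the difference concrete.  A minimal sketch
(hoh_cluster.pdb is a hypothetical name; use whatever you called the showsphere
output for the HOH-containing protein, and the second command assumes the usual
ATOM records in the pdb files):

wc -l sphgen_cpp_cluster1.pdb hoh_cluster.pdb
grep -c '^ATOM' sphgen_cpp_cluster1.pdb hoh_cluster.pdb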

test8NovC.out  shows possible failures:
# Select spheres within 10 Ang of ligand
../../../bin/sphere_selector struct.sph lig.mol2 3.0
../dockdif -t 3 selected_spheres.sph.save selected_spheres.sph
diffing selected_spheres.sph.save with selected_spheres.sph
possible FAILURE:  check selected_spheres.sph.dif
==============================================================
# Convert selected spheres into pdb format for viewing
../../../bin/showsphere < select.in > /dev/null
../dockdif selected_cluster.pdb.save selected_cluster.pdb
diffing selected_cluster.pdb.save with selected_cluster.pdb
possible FAILURE:  check selected_cluster.pdb.dif
==============================================================
...
Processing test mpi
/usr/local/bin/mpirun -np 2 ../../../bin/dock6.mpi -i mpi.dockin -o mpi.dockmpiout
Initializing MPI Routines...
Initializing MPI Routines...
../dockdif -t 8 mpi.dockmpiout.save mpi.dockmpiout
diffing mpi.dockmpiout.save with mpi.dockmpiout
possible FAILURE:  check mpi.dockmpiout.dif


What are these dif's?
(Please, if the files are small, just cut and paste the contents into
the email rather than attaching them.)
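
If it is easier, something like this (untested) run from the directory where
you did make test would gather all of them at once:

for f in `find . -name '*.dif'` ; do echo "==== $f ===="; cat $f ; done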


> I wonder whether a grid.nrg of 149 MB may pose problems for the docking
> procedure, resulting in the errors described below.


Perhaps, but as far as I can tell the behavior indicates a failure
before the grid is read.
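
Since the serial dock6 segfaults as well, a backtrace from the serial binary
would localize the failure more precisely than the mpi stack dump.  A sketch,
assuming gdb is installed (the trace will be sparse unless dock was compiled
with -g):

gdb --args dock6 -i rigid.in -o rigid.out
(gdb) run
(gdb) bt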


Scott

On Wed, 7 Nov 2007, Francesco Pietra wrote:

> I have rebuilt the dock program as follows:
> 
> MPICH_HOME=/usr/local
> export MPICH_HOME
> cd install
> (save copy config.h)
> make distclean
> ./configure gnu parallel
> make dock #
> cd test
> make test 2>&1 | tee test7Nov.out
> the out file (attached)
> 
> The last section of the out file (attached) shows that "mpirun" was looked for
> in the wrong place, if I understand correctly that it was looked for in
> /usr/local/dock6/bin (it is actually in /usr/local/bin). Is it OK to copy
> "mpirun" from /usr/local/bin to /usr/local/dock6/bin?
> 
> Or should I rather add
> 
> MPICH_HOME=/usr/local
> export MPICH_HOME
> to my .bashrc?
> 
> (which only contains
> MPI_HOME=/usr/local
> export MPI_HOME)
> 
> Without doing anything other than the above compilation and tests,
> 
> mpirun -np 4 -i file.in -o file.out 2>&1 | tee errors.out
> 
> only runs to end for the protein without HOH residue. With the protein
> embodying HOH, the problems set forth below (now attached error ..) arise.
> 
> Probably the docking for the HOH-containing protein should be rerun when mpi is
> fixed, before assuming that there are problems with the protein.
> 
> I am really sorry for presenting such a confused situation.
> 
> Incidentally, I have successfully carried out test.parallel for Amber 9 (whose
> parallel build depends on OpenMPI, like DOCK).
> 
> 
> --- Scott Brozell <sbrozell at scripps.edu> wrote:
> 
> > Hi,
> > 
> > On Wed, 7 Nov 2007, Francesco Pietra wrote:
> > 
> > > Having solved empirically, without external help, the problem (posted
> > > last weekend) of running grid when residue HOH is present within the
> > > pore, I am now
> > 
> > 
> > Please post a reply to that thread with the problem resolution.
> > 
> > > faced by a segfault in running rigid docking. That occurs on both parallel
> > > and serial runs. It also occurs with previous files for the protein without
> > > HOH, where both rigid and flex run OK on mpirun -np 4. Thus, it seems that
> > > there is now something wrong with dock6 and I don't understand what.
> > > 
> > > Before these unsuccessful attempts, I had
> > > (1) Unsuccessfully tried to recompile Antechamber with new respgen.c (on
> > > Amber9), which did not affect previous compilation.
> > > (2) Carried out "apt-get update" on Debian Linux amd64 etch (i.e., stable)
> > > without paying attention to the little that was affected.
> > > ---------------
> > > 
> > > While running:
> > > 
> > > mpirun -np 4 -i rigid.in -o rigid.out
> > > 
> > > the process halted (rigid_scored.mol2 0 bytes) because
> > > 
> > > Initializing MPI Routines...
> > > [deb64:03540] *** Process received signal ***
> > > [deb64:03540] Signal: Segmentation fault (11)
> > > [deb64:03540] Signal code: Address not mapped (1)
> > > [deb64:03540] Failing at address: 0x2b9ef5691000
> > > dock6.mpi[3540]: segfault at 00002b9ef56910000 rip 0000000000447b1b rsp
> > > 00007fff43c137b0 error 6
> > > [deb64:03540] [0] /lib/libpthread.so.0 [0x2b9e681bc410]
> > > [deb64:03540] [1] dock6.mpi (_ZN6Orient12match_ligandER7DOCKMol+0x40b)
> > > [0x447b1b]
> > > [deb64:03540] [2] dock6.mpi (main+0xaf5) [0x42cc75]
> > > [deb64:03540] [3] /lib/libc.so.6(__libc_start_main+0xda)
> > > [0x2b9e682e14ca]
> > > [deb64:03540] [4] dock6.mpi (__gxx_personality_v0+0xc2) [0x41b4ea]
> > > [deb64:03540] *** End of error message ***
> > > mpirun noticed that job rank 0 with PID 3537 on node deb64 exited on
> > > signal 15 (Terminated).
> > > 3 additional processes aborted (not shown)
> > > ---------------------------
> > > 
> > > I also tried flex docking with the protein containing HOH:
> > > 
> > > mpirun -np 4 -i anchor_and_grow.in -o anchor_and_grow.out 2>&1 | tee errors.out
> > > 
> > > with an even more complex set of library and mpi problems (please see the
> > > attachment).
> > > ------------------
> > > 
> > > Parallel dock had so far run correctly. DOCK6.1 had been compiled with:
> > > 
> > > ./configure gnu parallel
> > > MPICH_HOME=/usr/local
> > > export MPICH_HOME
> > > make dock
> > > 
> > > I have now tried unsetting $AMBERHOME in my .bashrc, without success, as
> > > expected because it should only be relevant to running amber_score. I did
> > > not otherwise change my .bashrc, where
> > > 
> > > DOCK_HOME=/usr/local/dock6
> > > PATH=$PATH:$DOCK_HOME/bin; export DOCK_HOME PATH
> > > MPI_HOME=/usr/local
> > > export MPI_HOME
> > > 
> > > I have now tried the test
> > > cd test/mpi
> > > make 2>&1 | tee test_parallel.out
> > > which passed OK.
> > > 
> > > Also:
> > > which mpicxx
> > > /usr/local/bin/mpicxx
> > > 
> > > Also:
> > > updatedb
> > > locate mpi.h
> > > /usr/include/sc/util/group/memmtmpi.h
> > > /usr/include/sc/util/group/messmpi.h
> > > /usr/dock6/src/dock/base_mpi.h
> > > /usr/local/include/mpi.h
> > > /usr/local/openmpi-1.2.3/ompi/include/mpi.h
> > > /usr/local/openmpi-1.2.3/ompi/include/mpi.h.in
> > > /usr/local/openmpi-1.2.3/ompi/mpi/f77/prototypes_mpi.h
> > > 
> > > -----------------------
> > > Also, a serial trial gives a segfault:
> > > 
> > > dock6 -i rigid.in -o rigid.out
> > > 
> > > dock6[3602]: segfault at 00002b4da6e0c000 rip 000000000043ffd1 rsp
> > > 00007fff86593bc0 error 6
> > > Segmentation fault
> > > ---------------------------
> > 
> > It is curious that the dock mpi test is passing, but other dock runs
> > are failing.  However, the failures suggest a machine problem.
> > If you have updated the operating system then it is likely that
> > executables will at least have to be re-linked and maybe re-compiled.
> > If you are a miser then try re-linking.
> > Otherwise rebuild dock:
> > cd install
> > # save any special config.h's
> > make distclean
> > ./configure ...
> > make install
> > # build parallel
> > 
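
For reference, the whole rebuild sequence I have in mind looks roughly like
this (an untested sketch that just strings together the commands you have
already used; it assumes your OpenMPI under /usr/local, and the serial
configure arguments are only a guess at what you used before):

cd install
# save any special config.h's
make distclean
./configure gnu
make install
# now the parallel executables
# (another make distclean between the two builds may be needed; I have not checked)
MPICH_HOME=/usr/local
export MPICH_HOME
./configure gnu parallel
make dock

Then re-run the tests as described above.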





