[CPMD-list] parallel cpmd errors 'Null communicator' and 'semget failed'
Axel Kohlmeyer
akohlmey at cmm.chem.upenn.edu
Sat Dec 9 02:15:03 CET 2006
On Thu, 7 Dec 2006, Dan Chipman wrote:
dan,
DC> When I try to run this on a single cpu I get the error message:
DC>
DC> 0 - MPI_COMM_RANK : Null communicator
DC> p0_25827: p4_error: : 197
DC> [0] Aborting program !
DC>
DC> Alternatively, when I try to run this in parallel on several cpus I
DC> get the error message:
DC>
DC> rm_16079: p4_error: semget failed for setnum: 0
DC> p0_25714: (0.523438) net_recv failed for fd = 5
DC> p0_25714: p4_error: net_recv read, errno = : 104
DC> Killed by signal 2.
DC> Broken pipe
can you run other MPI software on that machine?
it looks like either there is an incompatibility between the
(default) settings for SYSV shared memory and the requirements
of MPICH (the redhat/fedora default setup is usually _very_
conservative), or you're running out of SYSV semaphores.
you can check with ipcs whether this is the case.
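to make the ipcs check a bit more concrete, here is a rough python
sketch that compares the semaphores currently allocated against the
kernel limits. it assumes a linux box where /proc/sys/kernel/sem holds
the four values SEMMSL SEMMNS SEMOPM SEMMNI, and that `ipcs -s` prints
rows of the form "key semid owner perms nsems". the function names are
made up for illustration, and ipcs may only list the arrays your
account is allowed to see, so treat the numbers as a lower bound.

```python
import subprocess
from pathlib import Path


def parse_sem_limits(text):
    """Parse the four whitespace-separated fields of /proc/sys/kernel/sem."""
    semmsl, semmns, semopm, semmni = (int(v) for v in text.split())
    return {"SEMMSL": semmsl, "SEMMNS": semmns,
            "SEMOPM": semopm, "SEMMNI": semmni}


def summarize_sem_usage(ipcs_output):
    """Return (number of arrays, total semaphores) from `ipcs -s` text."""
    arrays = 0
    semaphores = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        # data rows start with a hex key; assumed columns:
        # key semid owner perms nsems
        if (len(fields) >= 5 and fields[0].startswith("0x")
                and fields[4].isdigit()):
            arrays += 1
            semaphores += int(fields[4])
    return arrays, semaphores


if __name__ == "__main__":
    limits = parse_sem_limits(Path("/proc/sys/kernel/sem").read_text())
    out = subprocess.run(["ipcs", "-s"], capture_output=True,
                         text=True, check=True).stdout
    arrays, semaphores = summarize_sem_usage(out)
    print(f"semaphore arrays: {arrays} of {limits['SEMMNI']} (SEMMNI)")
    print(f"semaphores:       {semaphores} of {limits['SEMMNS']} (SEMMNS)")
```

if either count sits at or near its limit, leftover semaphores from
earlier crashed jobs are a likely suspect for the 'semget failed' error.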
you may want to search the MPICH or CPMD mailing list
archives for hints on how to work around this (this looks
familiar, but it might have been a while...).
generally i'd recommend using LAM/MPI, as it is
very clean, robust, and performs more consistently
than MPICH over ethernet. it also does not 'swallow' the
error messages leading up to crashes the way MPICH does,
which is what makes debugging CPMD input errors in parallel
jobs so difficult with MPICH.
openMPI is slated to be a successor to LAM and
combines many of the advantages of LAM/MPI with
features of other MPI packages, but currently it
still tends to leave behind runaway communication
daemons after crashes, which need to be cleaned up.
cheers,
axel.
DC> Can anyone tell me what I am doing wrong? Thanks.
DC> Dan Chipman
--
=======================================================================
Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
Center for Molecular Modeling -- University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.