[CPMD-list] script for PGI-LAMMPI
Axel Kohlmeyer
axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Fri Oct 11 20:56:30 CEST 2002
>>> "MJ" == mjensen <mjensen at fysik.dtu.dk> writes:
MJ> Hi
hello!
MJ> Sorry for interfering; Carme Rovira is teaching me CPMD,
MJ> and I'm involved in the problem with running CPMD using LAM MPI.
so this is starting to be a truly international cooperation ;-).
MJ> When running CPMD on two nodes (0 and 1)
MJ> mpirun N -x PP_LIBRARY_PATH cpmd-mpi-lam.x test.inp
MJ> we get this error:
MJ> -----------------------------------------------------------------------------
MJ> One of the processes started by mpirun has exited with a nonzero exit
MJ> code. This typically indicates that the process finished in error.
MJ> If your process did not finish in error, be sure to include a "return
MJ> 0" or "exit(0)" in your C code before exiting the application.
MJ> PID 525 failed on node n1 with exit status 1.
MJ> -----------------------------------------------------------------------------
ok. so the remote cpmd executable was not willing to run.
MJ> Running on ONLY node 1 or node 0, respectively there's no problem.
aha.
MJ> Running a small mpi send-receive program in exactly the same way
MJ> is OK, whether using n0 .or. n1 .or. both (n0,1), a.k.a. option N
MJ> to mpirun (the cluster has only single processor per cpu (i.e. node) ).
MJ> Since there's no LAM-MPI option in the Configure script to generate a
MJ> LAM Linux version of CPMD we just used the LAM MPI version to wrap
MJ> the PGI compilers (mpif77 and mpicc), and in the .tcsh we have
PC-PGI-MPI is the configure option you want to use. if this is not
available, you are using an old version of the code and should
get a newer one. without that option you are missing the crucial defines:
-DPARALLEL -DMP_LIBRARY=__MPI -DMYRINET
in CPPFLAGS, as well as several bugfixes for the x86 architecture
and the pgi compiler.
MJ> setenv PGI /usr/local/lib/PGI
MJ> setenv LAMHOME /usr/local/lib/LAM/
MJ> setenv PATH ${PATH}:/usr/local/lib/LAM/bin
MJ> but had little luck when running. Any suggestion is highly appreciated.
MJ> Furthermore, is there any experience on compiling CPMD with
MJ> large file support on Linux?
i never tried it so far. and i doubt that it will be of much use,
if you don't have a _huge_ cluster around (and then you will probably
not want to use a tcp/ip based interconnect anyway).
MJ> Adding in this case to a MPICH-PGI CPMD Makefile
MJ> -Mlsf
MJ> as FFLAG
MJ> and
MJ> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
MJ> as CFLAG
MJ> resulted in a perfectly clean compilation but
MJ> this problem with the executable:
please note that you usually should compile _everything_
with largefile support (=LFS), including the mpi libraries, because
LFS touches the basic read/write interface that is also used
for communication via pipes or sockets.
MJ> -----------------------------------------------------------------------------
MJ> PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
MJ> LOADPA| PROCESSOR 1 HAS NO G COMPONENT.
MJ> PROGRAM STOPS IN SUBROUTINE LOADPA| TOO MANY PROCESSORS [PROC= 0]
MJ> [0] MPI Abort by user Aborting program !
MJ> [0] Aborting program!
MJ> p0_1356: p4_error: : 999
MJ> Broken pipe
i haven't used MPICH in a long time, since LAM is much more convenient
and of similar and more consistent speed (i had noticeable speed drops for
certain numbers of nodes with MPICH). but 999 is the last string that
cpmd usually prints if it finds an error in the input file or no input
file at all. this could happen if you did not compile with -DPARALLEL,
etc.
MJ> -----------------------------------------------------------------------------
MJ> Running the same job with a small ( MPICH CPMD ) exec works fine
do you mean without LFS? or just running on a single node?
MJ> Any way out of this?
MJ> Thanks -
MJ> Morten Jensen
best regards,
axel kohlmeyer.
[snip-snip]
...
--
=======================================================================
Axel Kohlmeyer e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53 Fax: ++49 (0)234/32-14045
D-44780 Bochum http://www.theochem.ruhr-uni-bochum.de
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.