[CPMD-list] script for PGI-LAMMPI

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Fri Oct 11 20:56:30 CEST 2002


>>> "MJ" == mjensen  <mjensen at fysik.dtu.dk> writes:

MJ> Hi

hello!

MJ> Sorry for interfering, Carme Rovira is teaching me CPMD, 
MJ> and I'm involved in the problem with running CPMD using LAM MPI.

so this is starting to be a truly international cooperation ;-).


MJ> When running CPMD on two nodes (0 and 1) 

MJ> mpirun N -x PP_LIBRARY_PATH cpmd-mpi-lam.x test.inp

MJ> we get this error:

MJ> -----------------------------------------------------------------------------
MJ> One of the processes started by mpirun has exited with a nonzero exit
MJ> code.  This typically indicates that the process finished in error.
MJ> If your process did not finish in error, be sure to include a "return
MJ> 0" or "exit(0)" in your C code before exiting the application.

MJ> PID 525 failed on node n1 with exit status 1.
MJ> -----------------------------------------------------------------------------

ok. so the remote cpmd executable was not willing to run.


MJ> Running on ONLY node 1 or node 0, respectively there's no problem.

aha.

MJ> Running a small mpi send-receive program in exactly the same way
MJ> is OK, whether using n0 .or. n1 .or. both (n0,1), a.k.a. option N
MJ> to mpirun (the cluster has only a single cpu per node).


MJ> Since there's no LAM-MPI option in the Configure script to generate a 
MJ> LAM Linux version of CPMD we just used the LAM MPI version to wrap
MJ> the PGI compilers (mpif77 and mpicc), and in the .tcsh we have

PC-PGI-MPI is the configure option you want to use. if it is not
available, you are using an old version of the code and should
get a newer one. otherwise you are missing the crucial defines
  -DPARALLEL -DMP_LIBRARY=__MPI -DMYRINET
in CPPFLAGS, as well as several bugfixes for the x86 architecture 
and the pgi compiler.
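
a quick way to see whether a hand-edited Makefile carries these defines is a plain substring search. the following is only a sketch: the function name check_defines and the default path ./Makefile are made up for illustration and are not part of CPMD or its Configure script, and the search looks at the whole file rather than at the CPPFLAGS line specifically.

```shell
#!/bin/sh
# Sketch: report which of the defines listed above are absent from a Makefile.
# check_defines and the default path ./Makefile are illustrative assumptions.
check_defines() {
    makefile="${1:-./Makefile}"
    status=0
    for flag in -DPARALLEL -DMP_LIBRARY=__MPI -DMYRINET; do
        # -F matches the flag literally; -e keeps the leading '-' from being
        # parsed as a grep option.
        if grep -q -F -e "$flag" "$makefile"; then
            echo "found:   $flag"
        else
            echo "missing: $flag"
            status=1
        fi
    done
    # Non-zero when at least one define was not found.
    return "$status"
}
```

calling `check_defines Makefile` in the build directory lists each define as found or missing and returns a non-zero status if any is missing.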

MJ>     setenv PGI /usr/local/lib/PGI
MJ>     setenv LAMHOME /usr/local/lib/LAM/
MJ>     setenv PATH ${PATH}:/usr/local/lib/LAM/bin

MJ> but had little luck when running. Any suggestion is highly appreciated.


MJ> Furthermore, is there any experience on compiling CPMD with
MJ> large file support on Linux? 

i never tried it so far. and i doubt that it will be of much use, 
if you don't have a _huge_ cluster around (and then you will probably
not want to use a tcp/ip based interconnect anyway).

MJ> Adding in this case to a MPICH-PGI CPMD Makefile 

MJ> -Mlsf 

MJ> as FFLAG 

MJ> and

MJ> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE

MJ> as CFLAG

MJ> resulted in a perfectly clean compilation but 
MJ> this problem with the executable:

please note that you usually should compile _everything_ 
with largefile support (=LFS), including the mpi libraries, because 
LFS touches the basic read/write interface that is also used 
for communication via pipes or sockets.
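
to keep the C flags identical across all components, it can help to take them from one place. the sketch below assumes a getconf that knows the LFS_CFLAGS variable (as the one shipped with glibc does) and falls back to the flags quoted above otherwise. the helper name lfs_cflags is hypothetical, and the fortran side still needs whatever switch the compiler itself provides.

```shell
#!/bin/sh
# Sketch: print the C compiler flags to use for large file support (LFS).
# lfs_cflags is a hypothetical helper name, not an existing tool.
lfs_cflags() {
    if command -v getconf >/dev/null 2>&1 &&
       flags=$(getconf LFS_CFLAGS 2>/dev/null); then
        # Some getconf implementations print "undefined" for unset variables;
        # treat that like an empty answer (no extra flags needed).
        case "$flags" in
            undefined) flags="" ;;
        esac
        echo "$flags"
    else
        # Fall back to the flags quoted in the original message.
        echo "-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE"
    fi
}
```

the output of `lfs_cflags` would then go into the C flags of the mpi library build and of the cpmd Makefile alike. an empty result means the platform reports no extra flags for large files.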

MJ> -----------------------------------------------------------------------------

MJ>  PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
MJ>  LOADPA| PROCESSOR    1 HAS NO G COMPONENT.


MJ>  PROGRAM STOPS IN SUBROUTINE LOADPA| TOO MANY PROCESSORS [PROC=   0]
MJ> [0] MPI Abort by user Aborting program !
MJ> [0] Aborting program!
MJ> p0_1356:  p4_error: : 999
MJ> Broken pipe

i haven't used MPICH in a long time, since LAM is much more convenient
and of similar and more consistent speed (i had noticeable speed drops for
certain numbers of nodes with MPICH). but 999 is the last string that
cpmd usually prints if it finds an error in the input file or no input
file at all. this could happen if you did not compile with -DPARALLEL,
etc.

MJ> -----------------------------------------------------------------------------

MJ> Running the same job with a small ( MPICH CPMD ) exec works fine

do you mean without LFS? or just running on a single node?

MJ> Any way out of this?

MJ> Thanks -

MJ> Morten Jensen

best regards,
     axel kohlmeyer.


[snip-snip]
...


--

=======================================================================
Axel Kohlmeyer       e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum                   http://www.theochem.ruhr-uni-bochum.de
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


