[CPMD-list] CPMD parallel run crash when trying to use kpoints

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Sun Aug 29 13:32:21 CEST 2004


>>> "EL" == EDUARDO JORGE LAMAS <LAMAS> writes:

EL> Hi, sorry if this is sent twice, but I sent the first one from an
EL> address that is not subscribed to the list and maybe that's why it
EL> didn't go through.

EL> I am trying to install CPMD on our Opteron cluster. The compilation
EL> goes well and the executable seems to work fine, except when I try
EL> to use the KPOINTS keyword in a parallel run (the serial version works
EL> well).

eduardo,

please try running without a swapfile (i.e. without
BLOCK=100 on the KPOINTS line). i tried a smaller but similar input and
it always choked on the swapfile handling. i then tried
to run the same input with older executables, and the most recent
executable that worked was a version 3.3 binary.
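
as a rough sketch of the change suggested above, the snippet below strips
the BLOCK=<n> option from the KPOINTS line of an input file. the helper name
strip_kpoints_block is hypothetical, and it assumes the option is always
written as BLOCK=<integer> on the line starting with KPOINTS, as in the
quoted input. editing that one line by hand does the same thing.

```python
import re

# Matches a BLOCK=<integer> token together with the whitespace before it,
# in the form used in the quoted input (assumption: no other spelling).
_BLOCK_OPTION = re.compile(r"\s+BLOCK=\d+", re.IGNORECASE)


def strip_kpoints_block(cpmd_input: str) -> str:
    """Return the input text with BLOCK=<n> removed from KPOINTS lines."""
    cleaned = []
    for line in cpmd_input.splitlines():
        # Only touch lines that start with KPOINTS; a RESTART line that
        # merely mentions KPOINTS is left unchanged.
        if line.strip().upper().startswith("KPOINTS"):
            line = _BLOCK_OPTION.sub("", line)
        cleaned.append(line)
    return "\n".join(cleaned)


# The relevant part of the &SYSTEM section from the quoted input.
example = """&SYSTEM
    KPOINTS MONKHORST-PACK BLOCK=100
     5 5 1
&END"""

print(strip_kpoints_block(example))
```

the k-point mesh line (5 5 1) is left as it is; only the swapfile option is
dropped.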

regards,
        axel.

EL> The CPMD version that I am trying to install is the latest one (3.9.1),
EL> but I had the same problem with version 3.7.2. The ATLAS library I am
EL> using is the one that is available on Axel Kohlmeyer's web page.

EL> The error I am getting is: 

EL> PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
EL>   NCPU     NGW     NHG  PLANES  GXRAYS  HXRAYS ORBITALS Z-PLANES
EL>      0     924    6042       5      98     350      11       1
EL>      1     924    6040       5      98     350      10       1
EL>      2     922    6036       5      98     350      11       1
EL>      3     920    6036       5      98     350      11       1
EL>      4     922    6036       5      98     350      10       1
EL>      5     923    6040       5     100     350      11       1
EL>      6     922    6037       5     100     350      11       1
EL>      7     925    6029       5     100     350      10       1
EL>      8     925    6037       5     100     350      11       1
EL>      9     926    6020       5      99     349      11       1
EL>     10     925    6035       5     100     348      10       1
EL>     11     924    6038       5     100     348      11       1
EL>                 G=0 COMPONENT ON PROCESSOR :     9
EL>  PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA

EL>  ***    LOADPA| THE NEW SIZE OF THE PROGRAM IS    3544 kBYTES ***
EL>  ***     RGGEN| THE NEW SIZE OF THE PROGRAM IS    3688 kBYTES ***
EL> p4_2993:  p4_error: interrupt SIGFPE: 8
EL> p8_3601:  p4_error: interrupt SIGFPE: 8
EL> p1_11826:  p4_error: interrupt SIGFPE: 8
EL> p5_2995:  p4_error: interrupt SIGFPE: 8
EL> p2_11827:  p4_error: interrupt SIGFPE: 8
EL> p6_2996:  p4_error: interrupt SIGFPE: 8
EL> p9_3602:  p4_error: interrupt SIGFPE: 8
EL> p7_2997:  p4_error: interrupt SIGFPE: 8
EL> p10_3603:  p4_error: interrupt SIGFPE: 8
EL> p11_3604:  p4_error: interrupt SIGFPE: 8
EL> bm_list_11825: (1.570312) net_send: could not write to fd=5, errno = 32
EL> bm_list_11825:  p4_error: net_send write: -1

EL> The same system works fine in parallel if the KPOINTS keyword is removed.

EL> My Makefile is:

EL> SRC  = .
EL> DEST = .
EL> BIN  = .
EL> #QMMM_FLAGS = -D__QMECHCOUPL
EL> #QMMM_LIBS  = -L. -lmm
EL> FFLAGS = -r8 -pc=64 -Msignextend
EL> LFLAGS = -Bstatic -L. -latlas $(QMMM_LIBS)
EL> CFLAGS =
EL> CPP = /lib/cpp -P -C -traditional
EL> CPPFLAGS = -D__Linux -D__PGI -DLAPACK -DFFT_DEFAULT -DPOINTER8 -D__pgf90 \
EL>                -DPARALLEL -DMP_LIBRARY=__MPI
EL> NOOPT_FLAG =
EL> CC = cc
EL> FC = mpif90 -c -O0 -tp k8-64
EL> LD = mpif90 -O0 -tp k8-64
EL> AR =

EL> And my input file is:

EL> &INFO
EL>   Wavefunction optimization bulk platinum
EL> &END
EL> &CPMD
EL>     rESTART WAVEFUNCTIONS OCCUPATION KPOINTS LATEST
EL>     OPTIMIZE WAVEFUNCTION
EL>     LSD
EL>     FREE ENERGY FUNCTIONAL
EL>     ELECTRON TEMPERATURE
EL>       1000.
EL>     STORE
EL>       5
EL> &END
EL> &DFT
EL>    FUNCTIONAL BLYP
EL> &END
EL> &SYSTEM
EL>    POINT GROUP
EL>     AUTO
EL>    SYMMETRY
EL>     14
EL>    CELL DEGREE
EL>      5.54846   1   1.5   90   90  120
EL>    CUTOFF
EL>      80.000
EL>     ANGSTROMS
EL>     TESR
EL>      3
EL>     KPOINTS MONKHORST-PACK BLOCK=100
EL>      5 5 1
EL> &END
EL> &ATOMS
EL> *Pt_TM_BLYPspd5.psp GAUSS-HERMIT=10 NLCC
EL>  LMAX=D LOC=S
EL>  12
EL>         0.00000 0.00000 0.00000
EL> .....
EL> .....
EL>         1.60193 0.00000 4.53093
EL> &END
EL> &BASIS
EL>  PSEUDO AO 2 OCUPPATION
EL> 0   2
EL> 1   9
EL> &END


EL> Any help will be appreciated.
EL> Best Regards,

EL> Eduardo




--

=======================================================================
Axel Kohlmeyer       e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum  http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


