[CPMD-list] MPI-Problem with WF optimization
Alessandro Curioni
cur at zurich.ibm.com
Thu Apr 14 13:06:44 CEST 2005
Bernd,
thank you for your input.
This is a known bug and has been corrected in the upcoming minor
release.
Regards
Alessandro CURIONI, PhD
Research Staff Member
Computational Biochemistry and Material Science group
IBM Research Division - Zurich Research Laboratory
Saumerstrasse 4
8003 Rueschlikon - Switzerland
e-mail: cur at zurich.ibm.com
www: www.zurich.ibm.com
Tel: +41-1-7248633
Fax: +41-1-7248958
Bernd Kallies <kallies at zib.de>
Sent by: cpmd-list-bounces at cpmd.org
04/12/2005 08:41 PM
To: cpmd-list at cpmd.org
cc:
Subject: [CPMD-list] MPI-Problem with WF optimization
Dear all,
I ran into a curious problem when running the attached input with CPMD
v3.9.1 (downloaded 3 June 2004) on an IBM p690.
The run aborts during a wavefunction optimization with this error message
from the MPI library:
ERROR: 0032-117 User pack or receive buffer is too small (32768) in
MPI_Allreduce, task 0
The error occurs at different stages, depending on the number of tasks
or the machine state, and the buffer size that MPI_Allreduce reports as
too small varies as well.
Debugging showed a code problem that seems fundamental to me.
The error is generated because MPI_Allreduce is called by MPI tasks that
are out of sync (they call glosum in different code contexts): the master
task is in a different context than the others. This happens because
the variable TNOFOR (set in tol_chk_cnvener) evaluates differently on
task 0 than on the other tasks, which in turn happens because task 0
has a different total energy than the others on entry to
tol_chk_cnvener. The cause of that is that subroutine linesr
(pcgrad.f) contains the lines
IF(PARENT) THEN
CALL EBACK(0)
ENDIF
...
IF(PARENT) THEN
CALL EBACK(1)
ENDIF
This yields different total energies across the MPI tasks when checking
for wavefunction convergence. When all tasks back up and restore the
energy values in linesr, the error disappears and the calculation
finishes properly.
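
To illustrate the mechanism, here is a small toy model in Python. It is
not CPMD code and uses no MPI: each "task" is just a list entry holding
its own copy of the total energy. All names (line_search, decision_flags,
tasks_in_sync) and the numbers are made up for illustration, and the
sketch assumes that EBACK(0) saves the energy and EBACK(1) restores it.
The flag is only a stand-in for a per-task decision such as TNOFOR.

```python
# Toy model (hypothetical names, not CPMD code): one energy copy per task.

def line_search(energies, restore_on_all_tasks):
    """Perturb each task's energy with a trial step, then restore a backup.

    If restore_on_all_tasks is False, only task 0 (the "parent") backs up
    and restores, mimicking the IF(PARENT) guards around EBACK.
    """
    result = []
    for task, energy in enumerate(energies):
        is_parent = (task == 0)
        do_backup = restore_on_all_tasks or is_parent
        backup = energy if do_backup else None   # assumed analogue of EBACK(0)
        energy = energy + 0.5                    # trial step changes the energy
        if do_backup:
            energy = backup                      # assumed analogue of EBACK(1)
        result.append(energy)
    return result


def decision_flags(energies, reference, tol):
    """Per-task decision based on the local energy (stand-in for TNOFOR)."""
    return [abs(e - reference) < tol for e in energies]


def tasks_in_sync(flags):
    """True if every task took the same decision, so collectives would match."""
    return len(set(flags)) == 1


start = [-10.0] * 4   # four tasks that agree on the energy before the search

guarded = decision_flags(line_search(start, False), -10.0, 0.05)
unguarded = decision_flags(line_search(start, True), -10.0, 0.05)

print("parent-only backup, tasks in sync:", tasks_in_sync(guarded))
print("all-task backup, tasks in sync:   ", tasks_in_sync(unguarded))
```

With the parent-only backup, task 0 takes a different branch than the
others, which is the situation in which a later collective call can be
reached from mismatched code paths. With the backup done on every task,
all flags agree.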
It is not really clear to me what impact this finding has, since
line searches on wavefunctions are performed in many CPMD runs.
--the input--
&CPMD
OPTIMIZE GEOMETRY
LBFGS
PCG MINIMIZE
CONVERGENCE ORBITALS
1.d-7
CONVERGENCE ADAPT
0.02
CONVERGENCE ENERGY
0.05
STORE
50
TASKGROUPS
1
&END
&DFT
FUNCTIONAL PBE
GC-CUTOFF
1.d-6
&END
&SYSTEM
ANGSTROM
SYMMETRY
8
CELL ABSOLUTE
8.34 8.34 14.255 0 0 0
CUTOFF
40.0
TESR
4
DUAL
6.0
&END
&ATOMS
*Mg_VDB_PBE.psp BINARY NEWF
LMAX=D
32
0.000 0.000 0.000
4.170 0.000 0.000
2.085 2.085 0.000
6.255 2.085 0.000
0.000 4.170 0.000
4.170 4.170 0.000
2.085 6.255 0.000
6.255 6.255 0.000
2.085 0.000 2.085
6.255 0.000 2.085
0.000 2.085 2.085
4.170 2.085 2.085
2.085 4.170 2.085
6.255 4.170 2.085
0.000 6.255 2.085
4.170 6.255 2.085
0.000 0.000 4.170
4.170 0.000 4.170
2.085 2.085 4.170
6.255 2.085 4.170
0.000 4.170 4.170
4.170 4.170 4.170
2.085 6.255 4.170
6.255 6.255 4.170
2.085 0.000 6.255
6.255 0.000 6.255
0.000 2.085 6.255
4.170 2.085 6.255
2.085 4.170 6.255
6.255 4.170 6.255
0.000 6.255 6.255
4.170 6.255 6.255
*O_VDB_PBE.psp BINARY NEWF
LMAX=D
32
2.085 0.000 0.000
6.255 0.000 0.000
0.000 2.085 0.000
4.170 2.085 0.000
2.085 4.170 0.000
6.255 4.170 0.000
0.000 6.255 0.000
4.170 6.255 0.000
0.000 0.000 2.085
4.170 0.000 2.085
2.085 2.085 2.085
6.255 2.085 2.085
0.000 4.170 2.085
4.170 4.170 2.085
2.085 6.255 2.085
6.255 6.255 2.085
2.085 0.000 4.170
6.255 0.000 4.170
0.000 2.085 4.170
4.170 2.085 4.170
2.085 4.170 4.170
6.255 4.170 4.170
0.000 6.255 4.170
4.170 6.255 4.170
0.000 0.000 6.255
4.170 0.000 6.255
2.085 2.085 6.255
6.255 2.085 6.255
0.000 4.170 6.255
4.170 4.170 6.255
2.085 6.255 6.255
6.255 6.255 6.255
CONSTRAINTS
FIX ATOMES
32
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
END CONSTRAINTS
&END
--
Dr. Bernd Kallies
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7
14195 Berlin
Tel: +49-30-84185-270
Fax: +49-30-84185-311
e-mail: kallies at zib.de
_______________________________________________
CPMD-list mailing list
CPMD-list at cpmd.org
http://cpmd.org/mailman/listinfo/cpmd-list