[CPMD-list] Platform-dependent wave function optimization results
Axel Kohlmeyer
akohlmey at cmm.chem.upenn.edu
Tue May 1 16:44:35 CEST 2007
On Tue, 1 May 2007, Vladimir Stegailov wrote:
VS> Dear Axel,
dear vladimir,
[...]
VS> However, those figures differ from the other case
VS> 3. cpmd compiled with IFC v. 9.0 and MKL 8.0.1, -O3 optimization
-O3 is generally best avoided with the intel compiler on any
quantum chemistry package. as of version 9.0 a few more aggressive
optimizations are enabled by default for -O3. regardless, i have
usually found '-O2 -unroll -pc64' to be the best combination of flags
for many quantum chemical software packages to get a fast and
sufficiently accurate/reliable compilation with intel compilers.
in fact on some machines (older pentium 4) it turned out to be
up to 20% faster than using -O3 + SSE vectorization (each contributed
about half of the slowdown).
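One way to organize such a cross-check is to list every build variant
before compiling anything. The sketch below is only an illustration:
the flag strings are the ones mentioned in this thread, while the
labels and the FFLAGS variable name are hypothetical and would need to
be adapted to the actual CPMD makefile.

```python
from itertools import product

# Optimization settings mentioned in this thread; the labels are hypothetical.
OPT_LEVELS = {
    "debug": "-O0",
    "recommended": "-O2 -unroll -pc64",
    "aggressive": "-O3",
}
# Build each variant with and without -zero (see the -zero remark in this thread).
EXTRA = {"plain": "", "zeroed": "-zero"}


def build_matrix():
    """Return (label, flag string) pairs for every combination to cross-check."""
    matrix = []
    for (opt_name, opt), (extra_name, extra) in product(
        OPT_LEVELS.items(), EXTRA.items()
    ):
        flags = f"{opt} {extra}".strip()
        matrix.append((f"{opt_name}-{extra_name}", flags))
    return matrix


for label, flags in build_matrix():
    # FFLAGS is an assumed makefile variable name, not taken from CPMD.
    print(f"{label:20s} FFLAGS = {flags}")
```

Running the same input with each of the six binaries and comparing the
outputs shows which flag change, if any, alters the results.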
VS> For example, the output from the very first lanczos diagonalization step
VS> in the 1st and 2nd case is
VS> <<1:4<<<<<<<<<<<<<< LANCZOS DIAGONALIZATION <<<<<<<<<<<<<<<<<<<<
VS> >> TIME FOR INITIAL SUBSPACE DIAGONALIZATION: 21.98
VS> >> CYCLE NCONV     B2MAX     B2MIN #HPSI  TIME
VS>        1     0 3.367E-07 2.547E-11  6.00 77.39
VS>
VS> in the 3rd case is
VS> <<1:4<<<<<<<<<<<<<< LANCZOS DIAGONALIZATION <<<<<<<<<<<<<<<<<<<<
VS> >> TIME FOR INITIAL SUBSPACE DIAGONALIZATION: 0.61
VS> >> CYCLE NCONV     B2MAX     B2MIN #HPSI  TIME
VS>        1     0 3.360E-07 2.475E-11  6.00  2.18
VS>
VS> In the 3rd case WFO goes without warnings as well.
VS>
VS> Should I consider this difference in the output from the -O0 and -O3 binaries as an error?
hard to tell. i don't know the code so well. it might be worth comparing
to a completely different platform. it looks a bit suspicious.
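To put a number on "a bit suspicious", the two LANCZOS rows quoted
above can be compared field by field. This is a rough sketch, not part
of CPMD: the helper names are made up, and it assumes each row has
exactly the six whitespace-separated columns shown in the header.

```python
import sys


def parse_lanczos_row(row):
    """Split one LANCZOS cycle row into named fields."""
    cycle, nconv, b2max, b2min, hpsi, time = row.split()
    return {
        "cycle": int(cycle),
        "nconv": int(nconv),
        "b2max": float(b2max),
        "b2min": float(b2min),
        "hpsi": float(hpsi),
        "time": float(time),
    }


def relative_difference(a, b):
    """Relative difference of two nonzero values, scaled by the larger one."""
    return abs(a - b) / max(abs(a), abs(b))


# Rows copied from the outputs quoted in this message.
row_a = parse_lanczos_row("1 0 3.367E-07 2.547E-11 6.00 77.39")
row_b = parse_lanczos_row("1 0 3.360E-07 2.475E-11 6.00 2.18")

for key in ("b2max", "b2min"):
    rel = relative_difference(row_a[key], row_b[key])
    ratio = rel / sys.float_info.epsilon
    print(f"{key}: relative difference {rel:.1e} ({ratio:.1e} x machine epsilon)")
```

Both relative differences are many orders of magnitude above double
precision machine epsilon. Without knowing how B2MAX and B2MIN are
accumulated this does not prove an error, but it is more than a single
rounding step would produce.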
VS> Is it acceptable that the output changes slightly (on the round-off
VS> errors level) when the optimization level is increased?
yes. some rounding errors are to be expected, but they are difficult
to distinguish from memory corruption or improperly initialized
arrays...
the intel compilers have the -zero flag, which might be worth trying out.
if you see a difference, then you have some uninitialized arrays.
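As an illustration of the round-off part of this answer:
floating-point addition is not associative, so a compiler that
reorders a sum at a higher optimization level can change the last
digits of a result without any bug being present. A minimal Python
sketch (not CPMD code; the numbers are arbitrary):

```python
import sys

# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

difference = abs(left_to_right - right_to_left)

# The two orders do not give bit-identical results ...
print(left_to_right == right_to_left)
# ... but they agree to within a couple of units of machine epsilon.
print(difference <= 2 * sys.float_info.epsilon)
```

Differences of this size are harmless on their own. Uninitialized
data typically shows up as much larger changes, or as changes that
vary from run to run.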
VS> > not a good sign. please note that the FEMD code is not thoroughly
VS> > tested on linux compilers.
VS> Please could you specify on which compilers the FEMD code was
VS> tested: xlf, PGI fortran ... ?
the last test of FEMD that i am aware of was in 2003 on an IBM regatta
using cpmd 3.7.2 and the IBM xlf compilers. most of the older
parts of the code were actually developed on IBM workstations!
cheers,
axel.
VS>
VS> Thank you.
VS>
VS> Kind regards,
VS> Vladimir
VS>
VS>
VS> ----- Original Message -----
VS> From: "Axel Kohlmeyer" <akohlmey at cmm.chem.upenn.edu>
VS> To: "Vladimir Stegailov" <stegailov at ihed.ras.ru>
VS> Cc: <cpmd-list at cpmd.org>
VS> Sent: Thursday, April 26, 2007 9:47 PM
VS> Subject: Re: [CPMD-list] Platform-dependent wave function optimization results
VS>
VS>
VS> > On Thu, 26 Apr 2007, Vladimir Stegailov wrote:
VS> >
VS> > vladimir,
VS> >
VS> >
VS> > VS> Dear colleagues,
VS> > VS>
VS> >
VS> > VS> is it normal to get essentially different WFO processes on different
VS> > VS> platforms using the same input script?
VS> >
VS> > how different is different. there are some small differences
VS> > possible, but within the accuracy of the method and parameters
VS> > you should get the same results.
VS> >
VS> > it is most likely, that you have a miscompiled binary
VS> > or a library with errors or numerical instabilities.
VS> >
VS> > VS> I use the 3.11 version and compare two platforms:
VS> > VS> 1. cpmd compiled with IFC v. 8.0 and libatlas_p4.a
VS> >
VS> > intel fortran 8.0 had a lot of problems, particularly
VS> > the original release. you have to upgrade to the latest
VS> > patchlevel, best to the latest patchlevel of 8.1, which
VS> > works quite reliably. if this is the bochum atlas binary,
VS> > please keep in mind that it is now very old and has not
VS> > been thoroughly tested against less frequently
VS> > used parts of cpmd. it should be easy to cross-check
VS> > against a different BLAS/LAPACK, e.g. mkl, to see if
VS> > the problem is in the compiler or the BLAS support.
VS> >
VS> > VS> 2. cpmd compiled with IFC v. 9.0 and MKL 8.0.1
VS> >
VS> > intel 9.0 in the original release was also very problematic,
VS> > though not as bad as 8.0. but please make sure you upgraded
VS> > to the latest patchlevel.
VS> >
VS> > VS> The input script is given below.
VS> >
VS> > VS> In the 1st case the WFO process initially gives several "FRIESNER_C|
VS> > VS> EIGENVECTOR 4 IS VERY BAD!" warnings, but eventually goes well and
VS> > VS> stops at NFI=179.
VS> >
VS> > not a good sign. please note that the FEMD code is not thoroughly
VS> > tested on linux compilers. i would recommend a cross-check compiling
VS> > everything with -zero to make sure everything is initialized. it appears
VS> > that some parts of the code (still) rely on the fact that the compiler
VS> > does this for you.
VS> >
VS> > VS> In the 2nd case there are no warnings and WFO stops much quicker at NFI=44.
VS> >
VS> > VS> Is the reason just the difference in FFT libraries? Could it be
VS> > VS> specific to the FEMD simulations?
VS> >
VS> > there is no indication that you have a different FFT. but also different
VS> > BLAS implementations can use (slightly) different algorithms and thus
VS> > produce differences that may direct the wavefunction optimization into
VS> > a different direction. since the FEMD code is used infrequently, there
VS> > is a much higher chance that some of it gets miscompiled. it can also
VS> > be more sensitive to small differences in the libraries or compiler
VS> > optimizations.
VS> >
VS> > the simplest tests are swapping libraries, turning off compiler
VS> > optimization (-O0) and swapping compilers. the risk of miscompilation
VS> > increases with higher optimization level.
VS> >
VS> > cheers,
VS> > axel.
VS> >
VS> > VS>
VS> > VS> I would appreciate any comments!
VS> > VS>
VS> > VS> Vladimir
VS> > VS>
VS> > VS>
VS> > VS> &CPMD
VS> > VS>
VS> > VS> FILEPATH
VS> > VS>
VS> > VS> /home/stegailov/testing/cpmd/md3/
VS> > VS>
VS> > VS> OPTIMIZE WAVEFUNCTION
VS> > VS>
VS> > VS> UNIT HESSIAN
VS> > VS>
VS> > VS> BFGS
VS> > VS>
VS> > VS> FREE ENERGY FUNCTIONAL
VS> > VS>
VS> > VS> LANCZOS DIAGONALISATION
VS> > VS>
VS> > VS> LANCZOS PARAMETERS
VS> > VS>
VS> > VS> 1 6 10 1.D-18
VS> > VS>
VS> > VS> TROTTER FACTOR
VS> > VS>
VS> > VS> 0.001
VS> > VS>
VS> > VS> BOGOLIUBOV CORRECTION OFF
VS> > VS>
VS> > VS> GRAM-SCHMIDT ORTHOGONALISATION
VS> > VS>
VS> > VS> CONVERGENCE
VS> > VS>
VS> > VS> 1.D-6 5.D-6
VS> > VS>
VS> > VS> MAXSTEP
VS> > VS>
VS> > VS> 5000
VS> > VS>
VS> > VS> BROYDEN MIXING
VS> > VS>
VS> > VS> 0.15 200 0.01 0 8
VS> > VS>
VS> > VS> ALEXANDER MIXING
VS> > VS>
VS> > VS> 1.1
VS> > VS>
VS> > VS> TEMPERATURE
VS> > VS>
VS> > VS> 400.
VS> > VS>
VS> > VS> ELECTRON TEMPERATURE
VS> > VS>
VS> > VS> 10000.
VS> > VS>
VS> > VS> COMPRESS WRITE32
VS> > VS>
VS> > VS> STRUCTURE BONDS
VS> > VS>
VS> > VS> ENERGYBANDS
VS> > VS>
VS> > VS> ELECTROSTATIC POTENTIAL
VS> > VS>
VS> > VS> RHOOUT
VS> > VS>
VS> > VS> &END
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> &SYSTEM
VS> > VS>
VS> > VS> POINT GROUP
VS> > VS>
VS> > VS> AUTO
VS> > VS>
VS> > VS> SYMMETRY
VS> > VS>
VS> > VS> 1
VS> > VS>
VS> > VS> CELL
VS> > VS>
VS> > VS> 8.064 1.0 1.0 0.0 0.0 0.0 (8.064=2*4.032, 4.032A is the eq lattice const of Al)
VS> > VS>
VS> > VS> CUTOFF
VS> > VS>
VS> > VS> 15.000
VS> > VS>
VS> > VS> ANGSTROMS
VS> > VS>
VS> > VS> STATES
VS> > VS>
VS> > VS> 250
VS> > VS>
VS> > VS> SCALE
VS> > VS>
VS> > VS> TESR
VS> > VS>
VS> > VS> 1
VS> > VS>
VS> > VS> KPOINTS MONKHORST-PACK FULL
VS> > VS>
VS> > VS> 2 2 2
VS> > VS>
VS> > VS> &END
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> &ATOMS
VS> > VS>
VS> > VS> *AL_SGS KLEINMAN-BYLANDER
VS> > VS>
VS> > VS> LMAX=D
VS> > VS>
VS> > VS> 32
VS> > VS>
VS> > VS> 0.0178998 -0.0197471 0.0145112
VS> > VS>
VS> > VS> ...
VS> > VS>
VS> > VS> 0.500011 0.750491 0.756745
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> VELOCITIES
VS> > VS>
VS> > VS> 32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
VS> > VS>
VS> > VS> -5.10982 0.101812 -1.43574
VS> > VS>
VS> > VS> ...
VS> > VS>
VS> > VS> 7.04835 2.96887 2.84766
VS> > VS>
VS> > VS> END VELOCITIES
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> &END
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> &BASIS
VS> > VS>
VS> > VS> PSEUDO AO 2
VS> > VS>
VS> > VS> 0 1
VS> > VS>
VS> > VS> &END
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS> &DFT
VS> > VS>
VS> > VS> NEWCODE
VS> > VS>
VS> > VS> &END
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS>
VS> > VS>
VS> >
VS> > --
VS> > =======================================================================
VS> > Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
VS> > Center for Molecular Modeling -- University of Pennsylvania
VS> > Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
VS> > tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
VS> > =======================================================================
VS> > If you make something idiot-proof, the universe creates a better idiot.
VS> >