[CPMD-list] problems with MD -- conservation of energy and translational drift

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Wed Feb 16 11:31:06 CET 2005


>>> "JS" == jernej  <jernej at cmm.ki.si> writes:

JS> Dear Axel,

dear jernej,

[...]

JS> Regarding compilation -- we've been using the Intel Fortran compiler package and
JS> it took quite a long time before successfully compiling the parallel code for
JS> mpich. Here are the main flags with which we finally made it:

JS> *************************************
JS> FFLAGS = -c -r8 -w90 -w95 -O3 -pc64 -axM -ip -tpp7

your main problem is the -ip flag. for a large code like CPMD,
whose time critical subroutines are already optimized, there is
no benefit from inter-procedural optimization. i recently made
some tests and found a _decrease_ in execution speed of about 5%.
i found similar effects when using the SIMD vectorizer (-axM).
finally, replacing -O3 with -O2 -unroll also increased the
CPMD performance (while at the same time cutting down
compilation time). so my 'fastest' FFLAGS settings for intel on P4 are:

FFLAGS = -c -r8 -w95 -O2 -pc64 -tpp7 -unroll -cm -tune pn4 -arch pn4

compared to your original flags, i found a total increase in execution
speed of over 15% in some cases with this 'reduced' optimization.

JS> LFLAGS = -llapack-ifc -lf77blas-ifc -latlas-ifc -Vaxlib $(QMMM_LIBS)
JS> LFLAGS = -L/net/pro/opt/intel/mkl/lib/32 -L. -lmkl_lapack -lmkl_def -Vaxlib
JS> -lpthread $(QMMM_LIBS) -static

hmmm, why are you using the generic mkl instead of the P4 version?
especially in the BLAS/LAPACK subroutines you benefit the most from
the SIMD instructions of the P4 cpu.

if you don't have a P4/Xeon cpu, however, you should not use
the P4 tuning flags (-tpp7 -tune pn4 -arch pn4) since the architecture
of most other x86 cpus (pentium 3, AMD athlon, opteron) works better
with -tpp6 -tune pn3 -arch pn3. the p4 architecture is quite
different from those and thus requires different optimizations.
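The choice between the two sets of tuning flags can be sketched as a small shell helper. This is only an illustration: the function name pick_tune_flags is hypothetical, and the patterns matched against the cpu model string are assumptions about how /proc/cpuinfo names a P4 or Xeon, so check them against your own machines. The flags themselves are the ones named above.

```shell
# hypothetical helper: map a cpu model string to the intel tuning flags
# $1: model name text, e.g. the 'model name' line from /proc/cpuinfo
pick_tune_flags() {
  case "$1" in
    # assumed spellings for P4 / Xeon model strings
    *"Pentium(R) 4"*|*Xeon*)
      echo "-tpp7 -tune pn4 -arch pn4" ;;
    # everything else (pentium 3, athlon, opteron, ...)
    *)
      echo "-tpp6 -tune pn3 -arch pn3" ;;
  esac
}

# possible usage on a linux box:
#   pick_tune_flags "$(grep -m1 'model name' /proc/cpuinfo)"
```

The output can then be pasted into the FFLAGS line of the Makefile in place of the hard-coded tuning flags.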

next, on newer linux machines, you should not use -static, since
even static binaries will load some shared objects at run time. with
static linking, those have to be _binary_ identical to the ones the
binary was linked against. so your binary may fail in strange ways
when you update the system or want to run it on a different
installation. if libc, libm, and libpthread are linked dynamically,
however, the dynamic linker can utilize the backward compatibility
layer of newer library versions and the binary will continue to
work correctly.

for the very latest version of their fortran compiler, intel introduced
the -i-static flag, which links only the intel provided libraries
statically. for older versions you can use -static-libcxa
instead. you can run 'ifort -help' to see if -i-static is supported.
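The 'ifort -help' check can be scripted along these lines. This is a rough sketch: the function name pick_static_flag is hypothetical, and it assumes the help output mentions -i-static literally when the flag is supported.

```shell
# hypothetical helper: choose the link flag for the intel libraries
# $1: the text printed by the compiler's help option
pick_static_flag() {
  # -F: fixed string match, -e: pattern follows (it starts with a dash)
  if printf '%s\n' "$1" | grep -qF -e '-i-static'; then
    echo "-i-static"
  else
    # fallback for older compiler versions
    echo "-static-libcxa"
  fi
}

# possible usage:
#   pick_static_flag "$(ifort -help 2>&1)"
```

Whichever flag comes out would replace -static at the end of the LFLAGS line, so that libc, libm, and libpthread stay dynamically linked.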

JS> CFLAGS = -c -O2 -Wall  -I/net/pro/opt/intel/mkl/include
JS> CPP = /net/pro/lib/cpp -P -C -traditional
JS> CPPFLAGS = -D__Linux -D__PGI -DLAPACK -DFFT_DEFAULT -DLINUX_IFC \
JS>            -DPARALLEL -DMP_LIBRARY=__MPI  
JS> NOOPT_FLAG = 
JS> CC = gcc-3.2 
JS> FC = $(HOME)/mpi/bin/mpif77
JS> LD = $(HOME)/mpi/bin/mpif77
JS> *************************************


JS> Thanks again, and best regards,

you are welcome.

best regards,
     axel.


JS> Jernej

JS> _______________________________________________
JS> CPMD-list mailing list
JS> CPMD-list at cpmd.org
JS> http://cpmd.org/mailman/listinfo/cpmd-list



--

=======================================================================
Axel Kohlmeyer       e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum  http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


