[CPMD-list] MPI and SMP on P4 Linux using PGI compilers
Axel Kohlmeyer
axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Tue Aug 12 13:13:45 CEST 2003
>>> "JK" == cpmd <cpmd at kressworks.com> writes:
JK> When I do
JK> # Configure
JK> to get a list of target platforms, I see
JK> PC-PGI PC-PGI-MPI
JK> PC-PGI-MPI-QMMM PC-PGI-QMMM
JK> PC-IFC-MPI PC-IFC
JK> and
JK> IBM-SP3 IBM-SP3-SMP
JK> IBM-SP4 IBM-SP4-SMP
JK> IBM-SP4-64 IBM-SP4-SMP-64
JK> but no SMP for PC-PGI. Is SMP supported on Linux PCs using Red Hat 7.3 and
JK> PGI compilers? How about PC-PGI-MPI-SMP?
hi jim,
smp configurations are supported by using openmp directives.
i have been (so far) successfully able to compile and test
such a binary with the PGI compiler (just add the flag -mp).
with the intel compiler you should use the flag -openmp.
so much for the good news. now the bad news:
it does not improve the performance on any pc hardware
i could get my hands on so far.
to explain: the OpenMP parallelization is not as complete
as the MPI parallelization. so for a small number of nodes,
you are almost always better off using MPI parallelization
over a combination of shared memory and TCP/IP (lam-mpi,
for example, does this pretty efficiently).
as far as i understand it, cpmd does the mpi distributed memory
parallelization by distributing the g-space over the mpi-nodes.
there is a scalability limit to this. sooner or later the
work is distributed so unevenly over the nodes that the time
they spend waiting for the slowest node to finish is longer
than the time you win by adding nodes. this is the regime where
the openmp parallelization is supposed to kick in. unfortunately,
with current pc hardware this does not seem to work out. when you
compile for openmp you lose one register, and there is also
some overhead in spawning threads etc. in my tests, this seems
to completely eat up the speedup of the openmp parallelization.
in fact, it was faster to use only one (of the two cpus) on each node.
you also have to keep in mind that pc hardware is so memory bandwidth
starved that using the second cpu significantly slows down the
first cpu (just run two single cpu jobs simultaneously and compare the
timing to a single cpu job with the second cpu left idle).
in my tests on dual athlon machines with AMD-760MPX chipset
and PC266-DDR-ECC memory, i got about 160% of the computing power of a
single cpu machine when running two jobs simultaneously. on a dual pentium 4
machine things get even worse. on an intel e7500 dual channel(!)
PC200-DDR memory board i got about 120% of the computing power when
using the second cpu. in fact, when i activated openmp/multithreading
the job got slower compared to the single cpu job. the penalty seems
a little smaller when i ran _really_ small jobs.
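
the "run two jobs and compare" check described above can be sketched roughly as follows. this is only an illustration, not part of cpmd: the workload (`memory_bound_job`, a buffer copy loop) and all function names are assumptions made up for the example, and the timing includes process startup, so a real test should use an actual single cpu cpmd job instead.

```python
import time
from multiprocessing import Process


def memory_bound_job(n_bytes=64 * 1024 * 1024, passes=20):
    """Hypothetical stand-in workload: copy a large buffer repeatedly."""
    buf = bytearray(n_bytes)
    for _ in range(passes):
        copy = bytearray(buf)  # each pass streams n_bytes through memory
    return len(copy)


def time_jobs(n_jobs, job=memory_bound_job):
    """Wall-clock time to run n_jobs copies of job at the same time."""
    procs = [Process(target=job) for _ in range(n_jobs)]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start


def relative_throughput(t_one_job, t_n_jobs, n_jobs):
    """Computing power of n_jobs simultaneous jobs relative to one job alone.

    2.0 would be perfect scaling on two cpus; 1.6 corresponds to the
    "about 160%" figure quoted for the dual athlon machine.
    """
    return n_jobs * t_one_job / t_n_jobs
```

to use it, call `time_jobs(1)` and `time_jobs(2)` from under an `if __name__ == "__main__":` guard (needed on platforms that spawn new processes) and pass both timings to `relative_throughput`. as a worked number: if one job alone takes 100 s and two simultaneous jobs take 125 s each, the result is 2 * 100 / 125 = 1.6, i.e. 160%.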
to summarize,
if you want to get the most out of your smp-pcs, try the following:
- if you have a limited number of smp-nodes (say less than 32-48 cpus,
  depending on the size of the system and the cutoff), you should
  use mpi parallelization only.
- if you have more nodes available, try using them in single cpu
  mode first and compare to the full-mpi/openmp-mpi speed.
- try to vary the number of nodes. due to the distributed memory
  parallelization strategy, there are 'magic numbers' that may vary
  from system to system and with the cutoff where you get a very
  good load distribution and thus a good overall performance.
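
why such 'magic numbers' can appear is easy to see in a toy model. the sketch below is not how cpmd actually distributes the g-space; it simply assumes the work comes in indivisible units with known costs, hands them out greedily (largest first, to the least loaded node), and reports the parallel efficiency for each node count. all names are made up for the illustration.

```python
import heapq


def max_load(costs, n_nodes):
    """Load of the busiest node after a greedy largest-first assignment."""
    loads = [0.0] * n_nodes
    heapq.heapify(loads)
    for cost in sorted(costs, reverse=True):
        lightest = heapq.heappop(loads)  # node with the least work so far
        heapq.heappush(loads, lightest + cost)
    return max(loads)


def parallel_efficiency(costs, n_nodes):
    """Total work divided by (nodes * busiest node); 1.0 is perfect balance."""
    return sum(costs) / (n_nodes * max_load(costs, n_nodes))


def scan_node_counts(costs, node_counts):
    """Efficiency for each candidate node count, to spot the good ones."""
    return {p: parallel_efficiency(costs, p) for p in node_counts}
```

with 12 equal work units, `scan_node_counts([1] * 12, [4, 5, 6])` gives an efficiency of 1.0 for 4 and 6 nodes but only 0.8 for 5 nodes, because with 5 nodes two of them get 3 units while the other three wait with 2. for a real system the only reliable way to find the good node counts is still to time actual runs.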
hope this helps,
axel.
JK> Thanks.
JK> Jim Kress
--
=======================================================================
Axel Kohlmeyer e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53 Fax: ++49 (0)234/32-14045
D-44780 Bochum http://www.theochem.ruhr-uni-bochum.de
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.