[CPMD-list] Parallelization Problem

Axel Kohlmeyer akohlmey at vitae.cmm.upenn.edu
Fri Jul 21 15:51:33 CEST 2006


On 7/21/06, Michele Monteferrante <m.monteferrante at caspur.it> wrote:
> Dear CPMD users,

dear michele,

> i am using a power 5 machine with cpmd 3.9.2.

are you talking about this machine?
http://www.caspur.it/attivitaeservizi/calcolo/sistemi/ibm/

> The time needed to make a CP step increases drastically if i use more
> than 16 CPUs for my calculation. For example with 16 CPUs this time is
> about 7s, while with 20 CPUs it is about 60s.

there are many possible reasons for that.
for a machine like that, the major bottleneck for large
MPI jobs is the inter-node communication. as long as
you only use cpus within one node, the communication
is very fast (via shared memory). as soon as your job
spans multiple nodes, many cpu-to-cpu communications
have to be channeled through a single link, since MPI has
no knowledge of the topology of your machine and thus
cannot adapt to it.
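
to put a rough number on this, here is a minimal python sketch that counts how many pairs of MPI ranks end up on different nodes when the ranks are filled in node by node. the 16 cpus per node and the block-wise rank placement are assumptions for illustration, not known properties of the machine, and the function name is made up.

```python
from collections import Counter

def cross_node_pairs(n_ranks, cpus_per_node):
    """Count rank pairs whose two members sit on different nodes.

    Assumes block placement: ranks 0..cpus_per_node-1 on node 0, and so on.
    """
    # number of ranks placed on each node
    per_node = Counter(rank // cpus_per_node for rank in range(n_ranks))
    total_pairs = n_ranks * (n_ranks - 1) // 2
    same_node_pairs = sum(n * (n - 1) // 2 for n in per_node.values())
    return total_pairs - same_node_pairs

# cpus_per_node=16 is an assumed value, not taken from the machine specs
for n in (16, 20):
    print(n, "ranks:", cross_node_pairs(n, 16), "pairs cross the node boundary")
```

under these assumptions, 16 ranks have no cross-node pairs at all, while 20 ranks have 64 (out of 190), and all of those would have to share the inter-node link.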

in the case of CPMD you have the option to compile a mixed
OpenMP/MPI parallel binary and then use, e.g., 4 OpenMP
threads per MPI task and thus reduce the inter-node communication.
mind you, the OpenMP parallelization is less efficient than
MPI, so it only makes sense for larger jobs.
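
as a small illustration of the bookkeeping, the following python sketch splits a cpu count into MPI tasks and OpenMP threads. the helper name is hypothetical; OMP_NUM_THREADS is the standard OpenMP environment variable, but how it has to be passed to the job depends on the local batch system, which is not shown here.

```python
def hybrid_layout(total_cpus, threads_per_task):
    """Split a cpu count into MPI tasks times OpenMP threads per task."""
    if total_cpus % threads_per_task != 0:
        raise ValueError("total_cpus must be a multiple of threads_per_task")
    return {
        "mpi_tasks": total_cpus // threads_per_task,
        # OMP_NUM_THREADS is the standard OpenMP variable for the thread count
        "env": {"OMP_NUM_THREADS": str(threads_per_task)},
    }

layout = hybrid_layout(20, 4)
print(layout["mpi_tasks"], "MPI tasks,", layout["env"]["OMP_NUM_THREADS"], "threads each")
```

with 20 cpus and 4 threads per task, only 5 MPI tasks are left to talk to each other instead of 20.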

other than that, efficient parallelization also depends on the
size of the system and a few other tricks in the input file.
e.g. if you upgrade to version 3.11.1, you can use
REAL SPACE WFN KEEP in the &CPMD section to reduce
the number of (parallel) fourier transforms, which should
give a significant speed benefit in your case (as it does in
many cases).
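
a minimal python sketch of adding that keyword to an existing input file could look like this. it assumes the input has a &CPMD section header on a line of its own; the helper name and the tiny example input are made up for illustration, and the keyword only helps with a CPMD version that supports it.

```python
def add_cpmd_keyword(input_text, keyword="REAL SPACE WFN KEEP"):
    """Insert a keyword line right after the &CPMD section header.

    Leaves the text unchanged if the keyword is already present.
    """
    lines = input_text.splitlines()
    if any(line.strip().upper() == keyword for line in lines):
        return input_text
    for i, line in enumerate(lines):
        if line.strip().upper() == "&CPMD":
            lines.insert(i + 1, "  " + keyword)
            return "\n".join(lines) + "\n"
    raise ValueError("no &CPMD section found")

# made-up minimal input, only for illustration
example = "&CPMD\n  MOLECULAR DYNAMICS CP\n&END\n"
print(add_cpmd_keyword(example))
```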

> This is the makefile i used.

another thing you have to watch is that you really use the fast
network and not the gigabit ethernet. my experience in running
on that kind of node is minuscule, so i don't know for sure.
on some machines, you have to use a special flag or set a special
property in the job submit script to get access to the fast network.

but then again, unless you are running huge systems, it is probably
best to tune your jobs so that they stay within a single node, which
gives you the best throughput.
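
the timings from the question make the throughput argument concrete. this short python sketch only multiplies the reported numbers (about 7s on 16 cpus, about 60s on 20 cpus); the function name is made up.

```python
def cpu_seconds_per_step(n_cpus, seconds_per_step):
    """Total cpu time spent on one CP step."""
    return n_cpus * seconds_per_step

# timings as reported in the original question
reported = {16: 7.0, 20: 60.0}
for n_cpus, t in reported.items():
    print(n_cpus, "cpus:", cpu_seconds_per_step(n_cpus, t), "cpu-seconds per step")
```

that is 112 cpu-seconds per step on 16 cpus against 1200 on 20, so the larger job costs roughly ten times more for the same work.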

ciao,
    axel.

> Thanks
> Ciao
>
> Michele


-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
  Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


