[CPMD-list] CPMD parallel scalability
Axel Kohlmeyer
akohlmey at cmm.chem.upenn.edu
Mon Apr 14 15:06:20 CET 2008
On Mon, 14 Apr 2008, Maurice de Koning wrote:
maurice,
MdK> Hi all,
MdK>
MdK> I'm running CPMD on an Altix 4700 system with 44 CPUs and 88 GB of RAM
MdK> memory.
MdK> At the moment I'm running a CP MD run of a cell containing 96 water
MdK> molecules using the
MdK> BLYP functional at 300 K. I noticed that the scalability is not very
MdK> good. If I run on more than
please check carefully how your job is propagated through the
machine and what settings you use to compile and what tools.
i have access to a very new altix4700 and noticed some oddities.
- when using intel MKL you have to set OMP_NUM_THREADS to 1 or else
MKL will try to multi-thread across the whole machine or at least
across one blade (two dual-core cpus). if that overlaps with your
MPI parallelization you are screwed.
BTW: regardless of what your sysadmins tell you, don't compile in OpenMP,
and better link MKL without threading support. i tried a hybrid
compile and it does work, but its performance is inferior to MPI.
- make sure that you use SGI's MPI. i tried compiling my own MPI
because of a bug in SGI's MPI that affects path-integrals in CPMD,
but those jobs would not go across more than one blade (= 4 cpus).
- check that you have enough memory (i.e. that nobody else is using
excessive amounts of memory). using more cpus will increase the
total memory usage and on top of that the SGI mpi will create
large RDMA buffers across the whole address space for each MPI task
unless instructed via environment variable to not do so.
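
As a minimal sketch of the OMP_NUM_THREADS point above: the helper below copies the environment, forces OMP_NUM_THREADS to 1, and hands that environment to the launcher. The launcher line (`mpirun -np 16 ./cpmd.x input.inp`) and the function names are assumptions for illustration only, not taken from the original post; the SGI MPI buffer variable is deliberately left out because the post does not name it.

```python
import os
import subprocess


def single_threaded_env(base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS forced to 1."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep MKL/OpenMP from spawning threads on top of the MPI tasks.
    env["OMP_NUM_THREADS"] = "1"
    return env


def launch(command, base_env=None, dry_run=True):
    """Run `command` with threading pinned to one thread per MPI task.

    With dry_run=True nothing is executed; the command and environment
    are returned so they can be inspected first.
    """
    env = single_threaded_env(base_env)
    if dry_run:
        return command, env
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Hypothetical launcher line; adapt to the local MPI and queueing setup.
    cmd = ["mpirun", "-np", "16", "./cpmd.x", "input.inp"]
    command, env = launch(cmd)
    print(" ".join(command), "with OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
```

Setting the variable in the job script itself works just as well; the point is only that it has to be set in the environment the MPI tasks actually inherit.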
MdK> about 16 CPUs, the time per MD step even starts to increase, such that
MdK> the total time starts growing with the
on most linux machines the TCPU number is pretty much useless,
particularly with multi-threading (as it includes the combined
cpu time of all threads but not the time spent, e.g. swapping).
always check the ELAPSED TIME at the end.
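
To judge scaling from the elapsed times, a small sketch like the following can help. It takes the elapsed wall time per MD step for several task counts and reports speedup and parallel efficiency relative to the smallest run. The function name and the numbers in the example are invented for illustration; they are not measurements from this thread.

```python
def scaling_table(elapsed):
    """Compute speedup and parallel efficiency from elapsed wall times.

    elapsed maps the number of MPI tasks to the elapsed (wall-clock)
    time per MD step in seconds. The run with the fewest tasks is the
    reference. Returns (tasks, time, speedup, efficiency) tuples.
    """
    base_n = min(elapsed)
    base_t = elapsed[base_n]
    rows = []
    for n in sorted(elapsed):
        speedup = base_t / elapsed[n]
        # Efficiency 1.0 means perfect scaling relative to the reference run.
        efficiency = speedup * base_n / n
        rows.append((n, elapsed[n], speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Made-up timings, purely to show the output format.
    example = {4: 40.0, 8: 20.0, 16: 16.0}
    for n, t, s, e in scaling_table(example):
        print(f"{n:4d} tasks  {t:8.2f} s/step  speedup {s:5.2f}  efficiency {e:5.2f}")
```

An efficiency that drops sharply at one particular task count, rather than gradually, points toward job placement or oversubscription more than toward the code itself.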
MdK> number of CPUs. Is there anything I can do about this?
as alessandro already mentioned, your system should scale
well. thus experience tells us that your scaling problems
come from the machine setup, from the way you run your job,
or from how you compiled the executable. unless you provide
more details, nobody will be able to give specific advice.
there is just too much guesswork needed.
MdK> Below is a part of the input script
this is useless, and quoting incomplete inputs is turning into
what is IMNSHO a really bad habit on this list. _if_ you made an error
in the input it is most likely in the part that you didn't quote.
so either post the whole file, or make it available via some webserver
or don't post anything, or even better use one of the test examples from
the CPMD-test archive. we know they work, everybody can download them if
needed and many of us already have done tests with them.
thanks,
axel.
MdK>
MdK> Cheers,
MdK>
MdK> Maurice
MdK>
--
=======================================================================
Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
Center for Molecular Modeling -- University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.