[CPMD-list] Running CPMD on multi-core Opteron cluster

Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu
Tue Oct 31 20:41:25 CET 2006


On Tue, 31 Oct 2006, Yiming Zhang wrote:

YZ> Dear CPMD community,

dear yiming,

YZ> I am currently running CPMD on an Opteron AMD64 cluster; every node has 4 CPUs
YZ> (2 dual core Opterons).
YZ> I am able to run CPMD in parallel using MPI MPD; an example script is pasted below.
YZ> When I ask for 32 for $NSLOTS, it actually runs on 32 nodes with only 1 of 4
YZ> cpus per node running. I am wondering if there is a way to fully utilize the
YZ> cluster, running on all cpus on 32 nodes, so that in total I can have 128 cpus.

YZ> If I simply ask for a larger $NSLOTS, MPD quits with an error message like
YZ> "more processors asked than available"; I can't remember it exactly.

there are many possible reasons why this may or may not work.
none of them are related to cpmd in any way; they affect any
MPI program. it may be the way the batch environment
is set up (i hope this is not the same machine that phillip
shemella is using...), it may be how you requested the nodes
(on some setups you have to explicitly request multiple
processors per node, or the batch environment will distribute
the requested slots across nodes first), or it may be that you
are not using the mpi environment correctly.

YZ> Any suggestions are welcome,
YZ> Yiming Zhang

i'm adding a suggestion for a hack which assumes that NSLOTS
actually represents full nodes and not slots/cpus.

it is probably best that you get in contact with the people
running the machine and clarify the situation (best use one
of the trivial mpi examples, so you can test it faster...).

YZ> 

for a more detailed analysis one needs to know
a lot about your local environment (each machine
is different, since each sysadmin has their own
'handwriting' in setting up a machine...).

for example:
- what batch system is used?
- what are the job requirements you specify on submission?
- how are $NSLOTS and $MACHINEFILE determined?
- what MPI package do you use?
- what are the contents of $MACHINEFILE?
and so on...
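
a minimal sketch of how some of these questions could be answered
from inside the job script, assuming a bourne shell and a machinefile
with one host name per line (some machinefile formats differ, e.g. by
putting a cpu count on the same line as the host). the fallback values
for NSLOTS and MACHINEFILE and the host names node01/node02 are made
up, so that the sketch also runs outside of a batch job; NHOSTS and
NLINES are names introduced here for illustration.

```shell
#!/bin/sh
# hypothetical fallbacks, only used when the batch system has not set these
NSLOTS=${NSLOTS:-2}
if [ -z "$MACHINEFILE" ]; then
    MACHINEFILE=${TMPDIR:-/tmp}/machines.$$
    printf 'node01\nnode02\n' > "$MACHINEFILE"   # made-up host names
fi

echo "NSLOTS      = $NSLOTS"
echo "MACHINEFILE = $MACHINEFILE"

# count how often each host is listed (assumes one host name per line)
sort "$MACHINEFILE" | uniq -c

# number of distinct hosts vs. total number of entries
NHOSTS=`sort -u "$MACHINEFILE" | wc -l | tr -d ' '`
NLINES=`wc -l < "$MACHINEFILE" | tr -d ' '`
echo "distinct hosts: $NHOSTS, total entries: $NLINES"
```

if every host shows up only once and NSLOTS equals the number of
hosts, then NSLOTS most likely counts nodes; if hosts are listed
several times, the slots are probably counted per cpu.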


YZ> mpdboot -n $NSLOTS -f $MACHINEFILE
YZ> mpdtrace -l

try adding this line here (assuming you use a bourne shell):

NCPU=`expr $NSLOTS \* 4`

YZ> mpiexec -n $NSLOTS /borg/yiming/borg_cpmd.x $base/inp > $base/out

and change this line to:

mpiexec -n $NCPU /borg/yiming/borg_cpmd.x $base/inp > $base/out
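
as a worked example of the arithmetic in this hack: the sketch below
assumes that NSLOTS counts nodes and that every node has 4 cpus
(2 dual core opterons, as described in the question). CPUS_PER_NODE
is a name introduced here for illustration, and the fallback value
of 32 is only the example from the question; inside a batch job the
value set by the batch system is used instead.

```shell
#!/bin/sh
# fallback of 32 is the example value from the question (assumption)
NSLOTS=${NSLOTS:-32}
# 2 dual core opterons per node (assumption taken from the question)
CPUS_PER_NODE=4

# total number of MPI processes to start
NCPU=`expr $NSLOTS \* $CPUS_PER_NODE`

# with NSLOTS=32 this prints: 32 nodes x 4 cpus = 128 processes
echo "$NSLOTS nodes x $CPUS_PER_NODE cpus = $NCPU processes"
```

if the batch system already counts cpus in NSLOTS, this would
oversubscribe the nodes, so it only makes sense after checking
what NSLOTS really stands for on your machine.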

YZ> mpdallexit
YZ> 

best regards,
   axel.

-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.



