[CPMD-list] CPMD parallel scalability

Maurice de Koning dekoning at ifi.unicamp.br
Tue Apr 15 13:58:15 CET 2008


Hi all,

Setting the environment variable OMP_NUM_THREADS to 1 did the trick.
Using the same input file, the system scales OK now, as one can see
from the attached outputs for 8, 16, 32 and 44 CPUs.
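
For reference, a minimal sketch in Python of launching a run with OpenMP
threading switched off. The executable name cpmd.x, the input file name
liquid.inp and the "mpirun -np" invocation are placeholders assumed for
illustration only; they have to be adapted to the local installation and
queueing system.

```python
import os
import subprocess


def single_threaded_env(base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS forced to 1."""
    env = dict(os.environ if base_env is None else base_env)
    # One thread per MPI task, so a threaded BLAS/LAPACK cannot
    # oversubscribe the CPUs already used by the MPI parallelization.
    env["OMP_NUM_THREADS"] = "1"
    return env


def build_launch_command(n_tasks, executable="cpmd.x", input_file="liquid.inp"):
    """Build the MPI launch command (all names here are hypothetical)."""
    return ["mpirun", "-np", str(n_tasks), executable, input_file]


def run_single_threaded(n_tasks):
    """Start the job with the single-threaded environment."""
    return subprocess.run(build_launch_command(n_tasks),
                          env=single_threaded_env(), check=True)
```

Calling run_single_threaded(16) would then start 16 MPI tasks, each
restricted to a single OpenMP thread.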

Thanks for your help!

Maurice


Axel Kohlmeyer wrote:
> On Mon, 14 Apr 2008, Maurice de Koning wrote:
>
> maurice,
>
> MdK> Hi all,
> MdK> 
> MdK> I'm running CPMD on an Altix 4700 system with 44 CPUs and 88 GB of
> MdK> RAM. At the moment I'm running a Car-Parrinello MD run of a cell
> MdK> containing 96 water molecules using the BLYP functional at 300 K.
> MdK> I noticed that the scalability is not very good. If I run on more than
>
> please check carefully how your job is propagated through the
> machine and what settings you use to compile and what tools.
>
> i have access to a very new altix 4700 and noticed some oddities.
>
> - when using intel MKL you have to set OMP_NUM_THREADS to 1 or else
>   MKL will try to multi-thread across the whole machine or at least
>   across one blade (two dual-core cpus). if that overlaps with your
>   MPI parallelization you are screwed.
>
>   BTW: regardless of what your sysadmins tell you, don't compile in OpenMP,
>   and better link MKL without threading support. i tried a hybrid 
>   compile and it does work, but its performance is inferior to MPI.
>
> - make sure that you use SGI's MPI. i tried compiling my own MPI
>   because of a bug in SGI's MPI that affects path-integrals in CPMD,
>   but those jobs would not go across more than one blade (= 4 cpus).
>
> - check that you have enough memory (i.e. that nobody else is using
>   excessive amounts of memory). using more cpus will increase the
>   total memory usage and on top of that the SGI mpi will create
>   large RDMA buffers across the whole address space for each MPI task
>   unless instructed via environment variable to not do so.
>   
> MdK> about 16 CPUs, the time per MD step even starts to increase, such that 
> MdK> the total time starts growing with the
>
> on most linux machines the TCPU number is pretty much useless, 
> particularly with multi-threading (as it includes the combined 
> cpu time of all threads but not the time spent, e.g. swapping). 
> always check the ELAPSED TIME at the end.
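
As an illustration of comparing runs by elapsed time, here is a small
Python sketch that turns wall-clock times into speedup and parallel
efficiency relative to the smallest CPU count. The timings in the usage
lines are made-up numbers, not values from the attached outputs; the real
ones have to be read from the ELAPSED TIME line of each output file.

```python
def scaling_table(elapsed_by_cpus):
    """Map {n_cpus: elapsed_seconds} to rows of (n_cpus, speedup, efficiency).

    Speedup and efficiency are measured relative to the run with the
    smallest CPU count, since a 1-CPU reference is often not available.
    """
    base_n = min(elapsed_by_cpus)
    base_t = elapsed_by_cpus[base_n]
    rows = []
    for n in sorted(elapsed_by_cpus):
        speedup = base_t / elapsed_by_cpus[n]
        # Ideal scaling gives speedup == n / base_n, i.e. efficiency 1.0.
        efficiency = speedup * base_n / n
        rows.append((n, speedup, efficiency))
    return rows


# Hypothetical wall-clock times in seconds, for illustration only.
example = {8: 100.0, 16: 55.0, 32: 32.0}
for n, speedup, efficiency in scaling_table(example):
    print(f"{n:3d} CPUs  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency that drops sharply, or a speedup below 1 when adding CPUs,
points to the kind of setup problem discussed in this thread.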
>
> MdK> number of CPUs. Is there anything I can do about this?
>
> as alessandro already mentioned, your system should scale 
> well. thus experience tells us that your scaling problems
> are either a problem of the machine setup, or of the way
> you run your job, or of how you compiled the executable. unless
> you provide more details, nobody will be able to give
> specific advice. there is just too much guesswork needed.
>  
> MdK> Below is a part of the input script
>
> this is useless, and quoting incomplete inputs is turning into 
> a really bad habit on this list, IMNSHO. _if_ you made an error 
> in the input it is most likely in the part that you didn't quote.
>
> so either post the whole file, or make it available via some webserver,
> or don't post anything, or even better use one of the test examples from
> the CPMD-test archive. we know they work, everybody can download them if
> needed and many of us already have done tests with them.
>
> thanks,
>    axel.
>
>
> MdK> 
> MdK> Cheers,
> MdK> 
> MdK> Maurice
> MdK> 
>
>   

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LiquidCPMD44.out2
Url: http://cpmd.org/pipermail/cpmd-list/attachments/20080415/2cc4df33/attachment-0004.cc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LiquidCPMD32.out2
Url: http://cpmd.org/pipermail/cpmd-list/attachments/20080415/2cc4df33/attachment-0005.cc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LiquidCPMD16.out2
Url: http://cpmd.org/pipermail/cpmd-list/attachments/20080415/2cc4df33/attachment-0006.cc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LiquidCPMD8.out2
Url: http://cpmd.org/pipermail/cpmd-list/attachments/20080415/2cc4df33/attachment-0007.cc 

