[CPMD-list] NCACHE size

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Fri Aug 27 15:33:51 CEST 2004


>>> "IK" == I Kozin <Kozin> writes:

>> I made some tests on my opteron machine, and cpmd linked to atlas is
>> approx. 1.5 more efficient than cpmd linked to acml, so that's right:
>> with cpmd the first thing to do is to find an optimized blas/lapack lib
>> for your cpu, and only after that should you play with the ncache value.
>> Many thanks to the guys who made atlas, (even if the mkl is more 
>> efficient with the itanium2 ).

IK> The difference I see on our Opteron 244 (wat32 benchmark) is about 2%:
IK> Axel's pre-built Atlas for Opterons is slightly quicker than Gnu ACML 2.0.
IK> Does it mean Atlas _has_ to be locally tuned?

igor,

timings on opteron and athlon are not 100% repeatable, so you have to use
tests that run long enough and repeat them several times to get reliable
results. since ATLAS tunes itself by timing, different ATLAS tuning runs
will therefore yield slightly different libraries. for the libraries
published on my homepage, i usually do several ATLAS tuning runs on
several machines, cross-compare the resulting libraries and then pick the
one that performs badly on no machine and is among the best (or the best)
on all of them.
in fact, the 'opteron' ATLAS library was tuned on an athlon 64 but
turned out to be faster (for CPMD) on all opterons i have access to
than the natively tuned ones. i also do some manual tuning (CacheEdge
detection for larger datasets with a second active process on a dual
machine, and larger blas buffers). i found that ACML (which is not GNU, btw.)
gives similar performance to my ATLAS library (sometimes faster, sometimes
slower, depending on the problem).
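
the selection rule described above (no bad performance on any machine,
among the best on all) can be sketched as a minimax choice over relative
slowdowns. this is only an illustration under assumptions: the library
names, machine names and timings are made-up placeholders, not measured
data, and pick_library is a hypothetical helper, not part of ATLAS or CPMD.

```python
from statistics import median

def pick_library(timings):
    """timings: {library: {machine: [wall-clock seconds of repeated runs]}}.

    Returns (best_library, worst_slowdown_per_library), where the slowdown
    is measured against the fastest library on each machine.
    """
    # Collapse repeated runs to a median so a single noisy run does not decide.
    med = {lib: {m: median(runs) for m, runs in per_machine.items()}
           for lib, per_machine in timings.items()}
    # Only compare on machines where every library was timed.
    machines = set.intersection(*(set(v) for v in med.values()))
    fastest = {m: min(med[lib][m] for lib in med) for m in machines}
    worst_slowdown = {lib: max(med[lib][m] / fastest[m] for m in machines)
                      for lib in med}
    return min(worst_slowdown, key=worst_slowdown.get), worst_slowdown

# Placeholder numbers, purely illustrative (NOT measurements).
example = {
    "tuned_on_a": {"machine_a": [100.0, 101.0, 99.0],
                   "machine_b": [120.0, 121.0, 119.0]},
    "tuned_on_b": {"machine_a": [110.0, 109.0, 111.0],
                   "machine_b": [100.0, 100.5, 99.5]},
    "tuned_on_c": {"machine_a": [102.0, 103.0, 101.0],
                   "machine_b": [103.0, 102.0, 104.0]},
}
choice, slowdowns = pick_library(example)
print(choice, slowdowns)
```

with these placeholder numbers the library that is never the fastest but
never far behind wins, which is the same kind of outcome as the athlon 64
tuned library doing well on all opterons.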

>> >PB> >btw: the best way to optimize the fft is use it as little as 
>> >PB> >possible (e.g. by using the REAL SPACE WFN KEEP keyword, provided
>> >PB> >there is enough memory on your machine).

IK> Apologies for diverting the topic further...
IK> Just tried REAL SPACE WFN KEEP and got 
IK>   MEMORY| MEMORY REQUIRED:   274776194  WORDS
IK>  PROGRAM STOPS IN SUBROUTINE MEMORY| TOO BIG VALUE

IK> memory.F: 82-91
IK> #if !defined(MALLOC8)
IK> C     Check if 8 * LEN is in the range of integer.
IK>       IF(IRAT.EQ.2) THEN
IK> C       Integer=integer*4 : 2^31=2147483648, 2^28=268435456
IK>         IF(LEN.GT.2**28) THEN
IK>           WRITE(*,*) ' MEMORY| MEMORY REQUIRED:',LEN,' WORDS'
IK>           CALL STOPGM('MEMORY','TOO BIG VALUE')
IK>         ENDIF
IK>       ENDIF
IK> #endif

IK> As far as I can see I can do nothing about IRAT. Can I use MALLOC8?

IRAT is defined in irat.inc and is the ratio of the size of a real*8
variable to the size of an integer. 2 is the correct value for PGI/opteron.
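
for reference, the arithmetic behind the check quoted above can be
reproduced in a few lines. this is a sketch that assumes 8-byte words
(real*8) and a signed 32-bit default integer, as implied by IRAT=2; the
variable names are for illustration only and do not come from the CPMD
sources.

```python
WORD_BYTES = 8             # assumed size of one real*8 word
LEN = 274_776_194          # words requested, taken from the error message
LIMIT_WORDS = 2**28        # threshold tested in the quoted memory.F check
INT32_MAX = 2**31 - 1      # largest value of a signed 32-bit integer

nbytes = LEN * WORD_BYTES

print("words over limit:  ", LEN > LIMIT_WORDS)
print("bytes over integer*4 range:", nbytes > INT32_MAX)
print("request in GiB:    ", nbytes / 2**30)
```

so the request is a bit more than 2 GiB, i.e. the byte count of the
allocation no longer fits into an integer*4, which is what the check
guards against.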

IK> I remember Axel advised for POINTER8 but against using MALLOC8 a while
IK> ago...

well, supposedly PGI v5.2 does support malloc for more 
than 2GB.  you can always give it a try.

axel.

IK> Igor




--

=======================================================================
Axel Kohlmeyer       e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum  http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


