[CPMD-list] A missing library?

Carl Krauthauser carl at UDel.Edu
Sun May 29 20:21:50 CEST 2005


Dear Axel,

Thank you very much for taking the time to address this issue for me; it 
is greatly appreciated!

The LAPACK library I am using is actually the merged version.  I am 
curious, however.  You mention below that, since BLAS and LAPACK contain 
no symbols with underscores, I should compile the ATLAS libraries using 
g77.  Should I continue to compile the "regular" LAPACK and BLAS 
libraries with ifort, and then merge the ATLAS subset of LAPACK (built 
with the g77 wrappers) with the "regular" LAPACK built with ifort?  Will 
this be okay?  This is where my lack of knowledge of the code internals 
of LAPACK and ATLAS is truly handicapping me!

I'll try building ATLAS strictly with the g77 wrappers and see what 
happens.  Thanks again, Axel; your assistance is invaluable!


Best Regards,
Carl

Axel Kohlmeyer wrote:
> On Sat, 28 May 2005, Carl Krauthauser wrote:
> 
> dear carl,
> 
> CK> > difference should be quite small, since a p3 and an athlon are
> CK> > quite similar architectures as far as ATLAS is concerned).
> CK> 
> CK> No, I compiled the ATLAS libraries as well as the LAPACK libraries using 
> CK> the new IFC (see attached make.inc files).  I am very puzzled now.
> 
> there are a few subtleties involved here. ATLAS does ship with a
> partial LAPACK library containing some functions that already take
> direct advantage of the ATLAS tuning infrastructure. but to get a
> full LAPACK, you have to _merge_ it with a standard LAPACK. this is
> not as bad in terms of performance as it may initially sound, since
> most of the speed of LAPACK comes from the fact that it uses
> BLAS extensively, and so with a well optimized BLAS (as ATLAS provides
> it), you'll get good performance. if you just have two separate
> lapack libraries, then you'll either get suboptimal performance
> when you list the standard lapack library first, since you skip the
> tuned parts from ATLAS, or you'll be missing some parts.
> 
> your linker sequence and the error message, however, suggest that
> you are just using the minimal LAPACK bundled with ATLAS. in that
> case, there seems to be a problem with the compilation, as the
> c-compiled part seems to be using the g77 conventions, but the fortran
> part is using different conventions. my suggestion is to compile ATLAS
> with g77. the fortran compiler has _no_ impact on the speed of
> atlas, as the fortran parts are just wrappers, and since BLAS and
> LAPACK routine names do not contain any underscores, the resulting
> binary should be compatible with ifort as well. but please note that to
> get the best performing ATLAS for CPMD on a (dual-)athlon machine
> you have to be extremely careful. in particular, you should not(!)
> compile a multithreaded library. the performance gain is small compared
> to running MPI locally. if at all, it only makes sense in combination
> with an OpenMP compile, but then you should use MKL: ATLAS cannot
> detect whether it was called from an OpenMP block and thus will
> always be using two threads, whereas MKL will only use one thread
> and thus be more efficient. performance of the p3 MKL is roughly
> the same as with a well tuned ATLAS, and then there is the issue
> of the (sometimes) inconsistent 3dnow!/fp-register handling on dual
> athlon machines...
> 
> in case you really did the merge as it is described in the ATLAS
> docs, then you may have mixed up some g77 compiled library with
> an ifort compiled library, or reused flags from a previous
> compile that directed the c/f77 interface to use g77 conventions.
> the problem obviously is in the c-compiled part of ATLAS, not the
> fortran part.
> 
> to cut a long story short: the speed differences between the different
> libraries (ATLAS, MKL, ACML, GOTO-dgemm) for any large program package
> are usually not so large. especially on 32-bit x86 platforms with
> their extreme lack of registers, the main speed gain is from optimizing
> memory bandwidth and cache use, and this needs about the same strategy
> for most x86 platforms (with the exception of the pentium-4/xeon).
> 
> best regards,
> 	axel.
> 
> [...]
> 
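
The remark in the quoted message that BLAS and LAPACK contain no symbols
with underscores can be illustrated with a small sketch. It assumes the
default name-mangling rules of both compilers on Linux (no flags that
change underscore handling): g77 lowercases a routine name and appends
one underscore, or two if the name already contains an underscore, while
ifort lowercases the name and always appends exactly one. The function
names below and the routine name "my_helper" are made up for
illustration; they are not part of ATLAS, LAPACK or either compiler.

```python
def g77_symbol(name: str) -> str:
    """Assumed g77 default: lowercase, then '_' (or '__' if the name has an underscore)."""
    lower = name.lower()
    return lower + ("__" if "_" in lower else "_")


def ifort_symbol(name: str) -> str:
    """Assumed ifort default on Linux: lowercase, then a single '_'."""
    return name.lower() + "_"


def mismatched(names):
    """Return the routine names whose symbols differ between the two conventions."""
    return [n for n in names if g77_symbol(n) != ifort_symbol(n)]


if __name__ == "__main__":
    # DGEMM, DSYEV and ZHEEV are real BLAS/LAPACK routine names;
    # "my_helper" is a hypothetical routine with an underscore.
    for routine in ["DGEMM", "DSYEV", "ZHEEV", "my_helper"]:
        print(f"{routine:10s} g77: {g77_symbol(routine):12s} ifort: {ifort_symbol(routine)}")
    print("differ:", mismatched(["DGEMM", "DSYEV", "ZHEEV", "my_helper"]))
```

Under these assumptions only the name containing an underscore is
mangled differently, which is why Fortran wrappers built with g77 should
still link against code compiled with ifort. Running nm on the built
libraries is the way to check the actual symbol names.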


