Exact Exchange

Gamma-point parallel implementation. Note that the new implementation will be available in the forthcoming CPMD release (from version 3.17.1).

New Parallel Implementation

 

Test Case 4 - Equation

 

Including the corrections reported in:


F. Gygi and A. Baldereschi, Phys. Rev. Lett. 62, 2160 (1989)
P. Broqvist, A. Alkauskas, and A. Pasquarello, Phys. Rev. B 80, 085114 (2009)

 

 

We have implemented a new taskgroup strategy (V. Weber, T. Laino, A. Curioni, IPDPS 2014 - available here with the permission of the authors) in which states and orbital pairs are distributed at the same time.

The cost of exact exchange scales as N^2 M log M, where N is the number of states and M is the number of points in the plane-wave (PW) mesh.
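
As a rough illustration of this scaling, the sketch below counts the non-redundant orbital pairs and multiplies by an M log M factor standing in for the FFT-like work per pair. The function names, the inclusion of diagonal pairs (i = j), and the unit prefactor are assumptions made for illustration; this is not CPMD's actual cost model.

```python
import math


def n_pairs(n_states: int) -> int:
    """Number of non-redundant orbital pairs (i <= j), diagonal included (assumption)."""
    return n_states * (n_states + 1) // 2


def relative_cost(n_states: int, mesh_points: int) -> float:
    """Cost estimate in arbitrary units: number of pairs times M log M."""
    return n_pairs(n_states) * mesh_points * math.log(mesh_points)


if __name__ == "__main__":
    mesh = 64 ** 3  # hypothetical mesh size, chosen only for the example
    base = relative_cost(128, mesh)
    doubled = relative_cost(256, mesh)
    # Doubling the number of states at a fixed mesh roughly quadruples the cost.
    print(n_pairs(128), round(doubled / base, 2))
```

At a fixed mesh, doubling N multiplies the estimate by nearly four, which is the N^2 behaviour quoted above.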
 
In the new parallel scheme:

  • each group computes a subset of the non-redundant orbital pairs
  • the pairs are distributed cyclically (ScaLAPACK-like)
  • the exchange (X) energy and the X contribution to the electronic gradient are summed and redistributed at the end of the computation (inter-group communication)
  • thresholding is possible via orbital localization and estimation of the overlap densities
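
The pair distribution described in the list above can be sketched as follows. This is a minimal, serial toy model, not the CPMD implementation: the function names are hypothetical, `pair_energy` is a placeholder callable supplied by the caller, and the final Python `sum` only stands in for the inter-group reduction that a real code would perform with message passing.

```python
from itertools import combinations_with_replacement


def cyclic_pair_distribution(n_states, n_groups):
    """Deal the non-redundant pairs (i <= j) to the groups in round-robin order."""
    pairs = list(combinations_with_replacement(range(n_states), 2))
    # Group g receives pairs g, g + n_groups, g + 2 * n_groups, ...
    return [pairs[g::n_groups] for g in range(n_groups)]


def exchange_energy(pair_energy, n_states, n_groups):
    """Each group sums its own pairs; the partial sums are combined at the end."""
    groups = cyclic_pair_distribution(n_states, n_groups)
    partial = [sum(pair_energy(i, j) for i, j in group) for group in groups]
    return sum(partial)  # stands in for the inter-group reduction


if __name__ == "__main__":
    groups = cyclic_pair_distribution(4, 3)
    print([len(g) for g in groups])  # group sizes differ by at most one pair
    print(exchange_energy(lambda i, j: 1.0, 4, 3))
```

Dealing the pairs cyclically keeps the group sizes within one pair of each other, so the load stays balanced without any knowledge of the cost of individual pairs.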

 

Li/Air Batteries (approximately 700 atoms) - PBE0 (SCF performance)

 
Test Case 5
 
 

Li/Air Batteries (approximately 1500 atoms) - PBE0 (MD performance)

 
 
Test Case 6
 

Strong-scaling run (up to 96 BG/Q racks, or about 6 million threads).
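
The thread count can be checked with a short calculation. The per-rack figures below (1024 nodes per rack, 16 compute cores per node, 4 hardware threads per core) are the standard Blue Gene/Q configuration and are not stated on this page, so treat them as assumptions.

```python
# Standard Blue Gene/Q figures (assumed; not given in the text above).
NODES_PER_RACK = 1024
CORES_PER_NODE = 16      # compute cores only
THREADS_PER_CORE = 4     # 4-way simultaneous multithreading


def hardware_threads(racks: int) -> int:
    """Total hardware threads available on the given number of racks."""
    return racks * NODES_PER_RACK * CORES_PER_NODE * THREADS_PER_CORE


if __name__ == "__main__":
    print(hardware_threads(96))  # 6291456, i.e. about 6 million threads
```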

The numbers on the markers give the run time per MD step.

 
 

32 Water Molecules

XC: Scale out and Performance
Without thresholding, we can achieve up to 60 ps of simulated time per week.
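
To relate a throughput of this kind to a wall-clock time per MD step, one needs the MD time step, which is not given here. The sketch below does the conversion for a caller-supplied time step; the 0.5 fs used in the example is purely hypothetical and is not the value used in these runs.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800


def seconds_per_step(ps_per_week: float, timestep_fs: float) -> float:
    """Wall-clock seconds per MD step implied by a weekly throughput."""
    steps_per_week = ps_per_week * 1000.0 / timestep_fs  # 1 ps = 1000 fs
    return SECONDS_PER_WEEK / steps_per_week


if __name__ == "__main__":
    # Hypothetical 0.5 fs time step (assumption, not taken from the source).
    print(seconds_per_step(60.0, 0.5))
```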
 
 

Amorphous SiH - HSE (50 Ry)

 
Amorphous Silicon
 
Performance and Scale-out
 
The best time per step is approximately 30 seconds.
 
 

Effect of Thresholding

 
XC: Thresholding