
Exact Exchange

Gamma-point parallel implementation. Please note that the new implementation will be available in the forthcoming CPMD release (from version 3.17.1 onwards).

New Parallel Implementation

 

Test Case 4 - Equation

 

Including the corrections reported in:


F. Gygi and A. Baldereschi, Phys. Rev. Lett. 62, 2160 (1989)
P. Broqvist, A. Alkauskas, and A. Pasquarello, Phys. Rev. B 80, 085114 (2009)

 

 

We have implemented a new taskgroup strategy (V. Weber, T. Laino, A. Curioni, IPDPS 2014 - available here with the permission of the authors), in which states and orbital pairs are distributed at the same time.

The cost of the exact exchange scales as N^2 M log M, where N is the number of states and M is the number of points in the plane-wave (PW) mesh.
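The scaling above can be illustrated with a minimal cost model. This is only a sketch: it assumes that the work is one FFT-sized operation (cost M log M) per non-redundant orbital pair, including the diagonal pairs, and the function names `n_pairs` and `exx_cost` are hypothetical, not part of CPMD.

```python
import math


def n_pairs(n_states):
    """Number of non-redundant orbital pairs (i, j) with i <= j."""
    return n_states * (n_states + 1) // 2


def exx_cost(n_states, mesh_points):
    """Relative cost estimate: one M log M operation per orbital pair.

    The result is in arbitrary units; only ratios are meaningful.
    """
    return n_pairs(n_states) * mesh_points * math.log2(mesh_points)


if __name__ == "__main__":
    mesh = 2 ** 20  # hypothetical mesh size
    ratio = exx_cost(200, mesh) / exx_cost(100, mesh)
    # Doubling the number of states roughly quadruples the cost.
    print(f"cost ratio for 200 vs 100 states: {ratio:.2f}")
```

Under these assumptions, doubling N at fixed M increases the estimated cost by a factor close to four, which is the N^2 behaviour quoted above.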
 
In the new parallel scheme we exploit the following:

  • each group computes a subset of the non-redundant orbital pairs
  • the pairs are distributed cyclically (ScaLAPACK-like)
  • the X-energy and the X-contribution to the electronic gradient are summed and redistributed at the end of the computation (inter-group communication)
  • optional thresholding via orbital localization and estimation of the overlap densities
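The pair distribution listed above can be sketched in a few lines. This is a serial toy illustration, not the CPMD implementation: the groups are plain Python lists rather than MPI task groups, `pair_term` is a hypothetical stand-in for the per-pair exchange contribution, and the factor of two for off-diagonal pairs assumes that the (i, j) and (j, i) terms are equal.

```python
from itertools import combinations_with_replacement


def nonredundant_pairs(n_states):
    """All orbital pairs (i, j) with i <= j."""
    return list(combinations_with_replacement(range(n_states), 2))


def cyclic_distribution(pairs, n_groups):
    """Deal the pairs out round-robin: pair k goes to group k % n_groups."""
    return [pairs[g::n_groups] for g in range(n_groups)]


def group_partial_sum(group_pairs, pair_term):
    """Work done inside one group.

    Off-diagonal pairs are weighted by 2 because each stands for both
    (i, j) and (j, i), assumed here to contribute equally.
    """
    return sum((1 if i == j else 2) * pair_term(i, j) for i, j in group_pairs)


def total_from_groups(n_states, n_groups, pair_term):
    """Final reduction: sum the partial results of all groups."""
    groups = cyclic_distribution(nonredundant_pairs(n_states), n_groups)
    return sum(group_partial_sum(g, pair_term) for g in groups)


if __name__ == "__main__":
    groups = cyclic_distribution(nonredundant_pairs(4), 3)
    for g, pairs in enumerate(groups):
        print(f"group {g}: {pairs}")
```

The round-robin assignment keeps the group sizes within one pair of each other, and the only step that needs data from every group is the final sum.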

 

Li/Air Batteries (approximately 700 atoms) - PBE0 (SCF performance)

 
Test Case 5
 
 

Li/Air Batteries (approximately 1500 atoms) - PBE0 (MD performance)

 
 
Test Case 6
 

Strong-scaling run (up to 96 BG/Q racks, or about 6 million threads).
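The thread count quoted for the largest run is consistent with the Blue Gene/Q hardware layout. The short check below assumes 1024 nodes per rack, 16 compute cores per node and 4 hardware threads per core; the function name `hardware_threads` is hypothetical.

```python
# Blue Gene/Q layout assumed for this check.
NODES_PER_RACK = 1024
CORES_PER_NODE = 16
THREADS_PER_CORE = 4


def hardware_threads(racks):
    """Total number of hardware threads on the given number of racks."""
    return racks * NODES_PER_RACK * CORES_PER_NODE * THREADS_PER_CORE


if __name__ == "__main__":
    print(hardware_threads(96))  # 6291456
```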

The numbers on the markers give the run time per MD step.

 
 

32 Water Molecules

XC: Scale out and Performance
Without thresholding, we can achieve up to 60 ps per week.
 
 

Amorphous SiH - HSE (50 Ry)

 
Amorphous Silicon
 
Performance and Scale-out
 
The best time per step is approximately 30 seconds.
 
 

Effect of Thresholding

 
XC: Thresholding