Exact Exchange

Gamma-point parallel implementation. Please note that the new implementation is available from CPMD version 3.17.1 onward.

New Parallel Implementation


Test Case 4 - Equation


Including the corrections reported in:

F. Gygi and A. Baldereschi, Phys. Rev. Lett. 62, 2160 (1989)
P. Broqvist, A. Alkauskas, and A. Pasquarello, Phys. Rev. B 80, 085114 (2009)



We have implemented a new task-group strategy (V. Weber, T. Laino, A. Curioni, IPDPS 2014; available here with the permission of the authors), in which states and orbital pairs are distributed at the same time.

The cost of the exact exchange scales as N^2 M log M, where N is the number of states and M is the size of the plane-wave (PW) mesh.
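To make the scaling concrete, here is a minimal sketch of a relative cost estimate. It assumes that the N^2 factor comes from the non-redundant orbital pairs, counted here as N(N+1)/2, and that each pair costs on the order of M log M for the FFTs on the mesh. The function name `exchange_cost` and the example sizes are hypothetical, and the result is only meaningful as a ratio between two system sizes, not as a run time.

```python
import math


def exchange_cost(n_states: int, mesh_points: int) -> float:
    """Relative cost model: (number of orbital pairs) x M log M.

    Assumption: the non-redundant pairs (i <= j) are counted as N(N+1)/2,
    and each pair costs about M * log2(M) operations (FFT-like work).
    """
    pairs = n_states * (n_states + 1) // 2
    return pairs * mesh_points * math.log2(mesh_points)


# Doubling the number of states at fixed mesh size (illustrative values):
ratio = exchange_cost(200, 2**21) / exchange_cost(100, 2**21)
print(f"cost ratio when N doubles: {ratio:.2f}")
```

Under these assumptions, doubling the number of states at a fixed mesh raises the estimated cost by a factor close to four, which is why distributing the orbital pairs matters for large systems.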
In the new parallel scheme we exploit the following:

  • each group computes a subset of the non-redundant orbital pairs
  • the pairs are distributed cyclically (ScaLAPACK-like)
  • the exchange (X) energy and the X-contribution to the electronic gradient are summed and redistributed at the end of the computation (inter-group communication)
  • optional thresholding via orbital localization and estimation of the overlap densities
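The first three points of the scheme can be sketched as follows. This is an illustration only, not the CPMD implementation: it assumes a simple one-dimensional round-robin assignment of the pairs (i <= j) to groups, whereas the actual layout may differ, and it stands in for the inter-group communication with a plain sum. The names `cyclic_pair_distribution` and `reduce_exchange` are hypothetical.

```python
from itertools import combinations_with_replacement


def cyclic_pair_distribution(n_states, n_groups):
    """Assign the non-redundant orbital pairs (i <= j) to groups round-robin.

    Assumption: a 1D cyclic layout, where pair number k goes to group
    k % n_groups. Returns one list of (i, j) pairs per group.
    """
    pairs = list(combinations_with_replacement(range(n_states), 2))
    return [pairs[g::n_groups] for g in range(n_groups)]


def reduce_exchange(partial_energies):
    """Stand-in for the final inter-group reduction of the X-energy."""
    return sum(partial_energies)


groups = cyclic_pair_distribution(n_states=4, n_groups=3)
for g, pairs in enumerate(groups):
    print(f"group {g}: {pairs}")

# Each group would compute a partial energy over its own pairs; here each
# pair contributes a dummy value of 1.0 so that the reduction counts pairs.
total = reduce_exchange(float(len(pairs)) for pairs in groups)
```

With a cyclic assignment the group sizes differ by at most one pair, so the load stays balanced without any group needing to know the work of the others until the final reduction.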


Li/Air Batteries (approximately 700 atoms) - PBE0 (SCF performance)

Test Case 5

Li/Air Batteries (approximately 1500 atoms) - PBE0 (MD performance)

Test Case 6

Strong scalability run (up to 96 BG/Q racks, or 6 million threads).

Numbers on markers depict the run time per MD step.


32 Water Molecules

XC: Scale out and Performance
Without thresholding, we can achieve up to 60 ps per week.

Amorphous SiH - HSE (50 Ry)

Amorphous Silicon
Performance and Scale-out
The best time per step is approximately 30 seconds.

Effect of Thresholding

XC: Thresholding