Exact Exchange

Gamma Point Parallel Implementation. Please note that the new implementation will be available in the forthcoming CPMD release (from version 3.17.1).

New Parallel Implementation


Test Case 4 - Equation


The implementation includes the corrections reported in:

F. Gygi and A. Baldereschi, Phys. Rev. Lett. 62, 2160 (1989)
P. Broqvist, A. Alkauskas, and A. Pasquarello, Phys. Rev. B 80, 085114 (2009)



We have implemented a new taskgroup strategy (V. Weber, T. Laino, A. Curioni, IPDPS 2014 - available here with the permission of the authors), in which states and orbital pairs are distributed at the same time.

The cost of the exact exchange scales as N^2 M log M, where N is the number of states and M is the size of the plane-wave (PW) mesh.
The new parallel scheme relies on the following:

  • each group computes a subset of the non-redundant orbital pairs
  • the pairs are distributed cyclically (ScaLAPACK-like)
  • the exchange (X) energy and the exchange contribution to the electronic gradient are summed/redistributed at the end of the computation (inter-group communication)
  • optional thresholding via orbital localization and estimation of the overlap densities
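
The pair distribution described above can be illustrated with a minimal Python sketch. It is not taken from the CPMD sources: the function names (nonredundant_pairs, cyclic_distribution, screen_pairs, relative_cost), the round-robin assignment rule, and the toy overlap estimate are all assumptions made for illustration. The sketch enumerates the N(N+1)/2 non-redundant pairs (i <= j), deals them cyclically to the groups, optionally drops pairs whose estimated overlap is below a threshold, and evaluates the N^2 M log M cost model per group of pairs.

```python
from itertools import combinations_with_replacement
from math import log2


def nonredundant_pairs(n_states):
    """Return all orbital pairs (i, j) with i <= j; (j, i) is redundant by symmetry."""
    return list(combinations_with_replacement(range(n_states), 2))


def cyclic_distribution(pairs, n_groups):
    """Deal pairs to groups round-robin: pair number k goes to group k % n_groups."""
    groups = [[] for _ in range(n_groups)]
    for k, pair in enumerate(pairs):
        groups[k % n_groups].append(pair)
    return groups


def screen_pairs(pairs, overlap_estimate, threshold):
    """Keep only pairs whose estimated overlap is at least `threshold`."""
    return [p for p in pairs if overlap_estimate(*p) >= threshold]


def relative_cost(n_pairs, mesh_points):
    """Cost model (arbitrary units): each pair costs about M log M for its FFTs."""
    return n_pairs * mesh_points * log2(mesh_points)


if __name__ == "__main__":
    n_states, n_groups, mesh_points = 8, 3, 1024
    pairs = nonredundant_pairs(n_states)
    groups = cyclic_distribution(pairs, n_groups)
    for g, work in enumerate(groups):
        cost = relative_cost(len(work), mesh_points)
        print(f"group {g}: {len(work)} pairs, relative cost {cost:.0f}")

    # Hypothetical overlap estimate: orbitals localized along a chain,
    # with the overlap halving for every unit of index distance.
    kept = screen_pairs(pairs, lambda i, j: 0.5 ** abs(i - j), threshold=0.1)
    print(f"{len(kept)} of {len(pairs)} pairs survive screening")
```

In a real run each group would accumulate its partial exchange energy and gradient contribution over its own pairs, and the partial results would be summed across groups at the end, as stated in the list above. The cyclic assignment keeps the number of pairs per group balanced to within one pair.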


Li/Air Batteries (approximately 700 atoms) - PBE0 (SCF performance)

Test Case 5

Li/Air Batteries (approximately 1500 atoms) - PBE0 (MD performance)

Test Case 6

Strong scalability run (up to 96 BG/Q racks, or 6 million threads).

The numbers on the markers give the run time per MD step.


32 Water Molecules

XC: Scale-out and Performance
Without thresholding, we can achieve up to 60 ps per week.

Amorphous SiH - HSE (50 Ry)

Amorphous Silicon
Performance and Scale-out
The best time per step is approximately 30 seconds.

Effect of Thresholding

XC: Thresholding
