
6.1 CPU time requirements

The following holds for the code pw.x and for non-ultrasoft (non-US) pseudopotentials (PPs). For US PPs there are additional terms to be calculated, which may add from a few percent up to 30-40% to the execution time. For phonon calculations, each of the $3N_{at}$ modes requires a CPU time of the same order as that required by a self-consistent calculation on the same system. For cp.x, the CPU time required for each time step is of the order of the time $T_h + T_{orth} + T_{sub}$ defined below.

The computer time required for the self-consistent solution at fixed ionic positions, $T_{scf}$, is:

$T_{scf} = N_{iter} T_{iter} + T_{init}$

where $N_{iter}$ = niter = number of self-consistency iterations, $T_{iter}$ = CPU time for a single iteration, and $T_{init}$ = initialization time. Usually $T_{init} \ll N_{iter} T_{iter}$.

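The time required for a single self-consistency iteration, $T_{iter}$, is:

$T_{iter} = N_k T_{diag} + T_{rho} + T_{scf}$

where $N_k$ = number of k-points, $T_{diag}$ = CPU time per Hamiltonian iterative diagonalization, $T_{rho}$ = CPU time for the charge density calculation, and $T_{scf}$ = CPU time for the Hartree and exchange-correlation potential calculation. Note that in this formula $T_{scf}$ denotes only the potential calculation, not the total self-consistency time defined above.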

The time for a Hamiltonian iterative diagonalization, $T_{diag}$, is:

$T_{diag} = N_h T_h + T_{orth} + T_{sub}$

where $N_h$ = number of $H\psi$ products needed by the iterative diagonalization, $T_h$ = CPU time per $H\psi$ product, $T_{orth}$ = CPU time for orthonormalization, and $T_{sub}$ = CPU time for subspace diagonalization.

The time $T_h$ required for a $H\psi$ product is

$T_h = a_1 M N + a_2 M N_1 N_2 N_3 \log(N_1 N_2 N_3) + a_3 M P N$.

The first term comes from the kinetic term and is usually much smaller than the others. The second and third terms come from the local and the nonlocal potential, respectively. Here $a_1$, $a_2$, $a_3$ are prefactors, $M$ = number of valence bands, $N$ = number of plane waves (basis set dimension), $N_1$, $N_2$, $N_3$ = dimensions of the FFT grid for wavefunctions ($N_1 N_2 N_3 \sim 8N$), and $P$ = number of projectors for PPs (summed over all atoms, over all values of the angular momentum $l$, and over $m = 1, \ldots, 2l + 1$).
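As a rough illustration of the claim that the kinetic term is usually the smallest, the three terms of the formula can be evaluated side by side. This is only a sketch: the function name `hpsi_terms`, the unit prefactors, and the sample sizes are placeholders chosen for illustration, since the real prefactors depend on the machine and on the implementation. The FFT grid size is taken as 8N, following the estimate given in the text.

```python
import math

def hpsi_terms(M, N, P, a1=1.0, a2=1.0, a3=1.0, fft_points=None):
    """Return the kinetic, local and nonlocal terms of T_h (arbitrary units).

    a1, a2, a3 default to 1.0 only as placeholders; they are not measured values.
    """
    if fft_points is None:
        fft_points = 8 * N  # N1*N2*N3 ~ 8N, the estimate quoted in the text
    kinetic = a1 * M * N
    local = a2 * M * fft_points * math.log(fft_points)
    nonlocal_ = a3 * M * P * N
    return kinetic, local, nonlocal_

# Hypothetical sizes: 100 bands, 10,000 plane waves, 200 projectors.
kin, loc, nl = hpsi_terms(M=100, N=10_000, P=200)
total = kin + loc + nl
print(f"kinetic {kin / total:.1%}, local {loc / total:.1%}, nonlocal {nl / total:.1%}")
```

With equal prefactors the kinetic term is a small fraction of the total, while the balance between the local and nonlocal terms depends on how P compares with 8 log(8N).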

The time $T_{orth}$ required by orthonormalization is

$T_{orth} = b_1 N M_x^2$

and the time $T_{sub}$ required by subspace diagonalization is

$T_{sub} = b_2 M_x^3$

where $b_1$ and $b_2$ are prefactors and $M_x$ = number of trial wavefunctions (this varies between $M$ and a few times $M$, depending on the algorithm).
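The practical consequence of these two formulas is cubic growth with system size. The sketch below assumes that both the number of plane waves and the number of trial wavefunctions grow linearly with the size of the system, which is an assumption of this example rather than a statement from the text. The function names and the unit prefactors are likewise placeholders.

```python
def t_orth(N, Mx, b1=1.0):
    """Orthonormalization cost, b1 * N * Mx^2 (arbitrary units)."""
    return b1 * N * Mx**2

def t_sub(Mx, b2=1.0):
    """Subspace diagonalization cost, b2 * Mx^3 (arbitrary units)."""
    return b2 * Mx**3

# Hypothetical starting sizes; 'scale' doubles the system.
N, Mx = 10_000, 120
scale = 2

orth_ratio = t_orth(scale * N, scale * Mx) / t_orth(N, Mx)
sub_ratio = t_sub(scale * Mx) / t_sub(Mx)
print(f"orthonormalization grows by {orth_ratio:.0f}x, subspace diagonalization by {sub_ratio:.0f}x")
```

Because the prefactors cancel in the ratios, the factor of 8 for a doubled system holds regardless of their actual values.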

The time $T_{rho}$ for the calculation of the charge density from the wavefunctions is

$T_{rho} = c_1 M N_{r1} N_{r2} N_{r3} \log(N_{r1} N_{r2} N_{r3}) + c_2 M N_{r1} N_{r2} N_{r3} + T_{us}$

where $c_1$, $c_2$ are prefactors, $N_{r1}$, $N_{r2}$, $N_{r3}$ = dimensions of the FFT grid for the charge density ($N_{r1} N_{r2} N_{r3} \sim 8N_g$, where $N_g$ = number of G-vectors for the charge density), and $T_{us}$ = CPU time required by the ultrasoft contribution (if any).

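The time $T_{scf}$ for the calculation of the potential from the charge density (the last term of $T_{iter}$, not the total self-consistency time) is

$T_{scf} = d_1 N_{r1} N_{r2} N_{r3} + d_2 N_{r1} N_{r2} N_{r3} \log(N_{r1} N_{r2} N_{r3})$

where $d_1$, $d_2$ are prefactors.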
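The formulas of this section nest into a single estimate, which the following sketch assembles. Everything specific in it is an assumption made for illustration: the `Sizes` container, the `estimate` function, the unit prefactors, and the sample sizes are hypothetical, and the result is in arbitrary units. To avoid the double use of the symbol for the self-consistency time, the code calls the potential term `t_pot`.

```python
import math
from dataclasses import dataclass, replace

@dataclass
class Sizes:
    M: int      # valence bands
    N: int      # plane waves
    P: int      # projectors
    Mx: int     # trial wavefunctions
    Ng: int     # G-vectors for the charge density
    Nk: int     # k-points
    Niter: int  # self-consistency iterations
    Nh: int     # H*psi products per diagonalization

def estimate(s, a=(1.0, 1.0, 1.0), b=(1.0, 1.0), c=(1.0, 1.0), d=(1.0, 1.0),
             t_us=0.0, t_init=0.0):
    """Total self-consistency time; all prefactors are placeholder values."""
    fft_wfc = 8 * s.N   # wavefunction FFT grid, ~8N
    fft_rho = 8 * s.Ng  # charge density FFT grid, ~8Ng
    t_h = (a[0] * s.M * s.N
           + a[1] * s.M * fft_wfc * math.log(fft_wfc)
           + a[2] * s.M * s.P * s.N)
    t_orth = b[0] * s.N * s.Mx**2
    t_sub = b[1] * s.Mx**3
    t_diag = s.Nh * t_h + t_orth + t_sub
    t_rho = c[0] * s.M * fft_rho * math.log(fft_rho) + c[1] * s.M * fft_rho + t_us
    # Potential from charge density: linear term plus FFT term.
    t_pot = d[0] * fft_rho + d[1] * fft_rho * math.log(fft_rho)
    t_iter = s.Nk * t_diag + t_rho + t_pot
    return s.Niter * t_iter + t_init

# Hypothetical system; compare the estimate before and after doubling the k-points.
base = Sizes(M=100, N=10_000, P=200, Mx=200, Ng=80_000, Nk=4, Niter=15, Nh=10)
more_k = replace(base, Nk=8)
ratio = estimate(more_k) / estimate(base)
print(f"doubling Nk multiplies the estimate by {ratio:.2f}")
```

Only the diagonalization term is multiplied by the number of k-points, so doubling the k-points increases the estimate by a factor somewhat below 2. With measured prefactors in place of the placeholders, the same structure could be used to see which term dominates for a given system.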


Paolo Giannozzi 2010-04-08