[CPMD-list] help for low cpu efficiency

马尚义 shyma at imr.ac.cn
Mon Jun 13 11:16:58 CEST 2005


Dear Axel:
     
     Thanks for your last reply to my question about low CPU efficiency in my calculations! I will now describe my small cluster in detail, which I hope will help you locate the problem.
     My NFS/NIS cluster consists of 5 single-processor P4 PCs. tf0 is the master node, and it shares its home directory with the slave nodes tf1, tf2, tf3 and tf4. I linked the directory /usr/local to the /home directory so that users can install programs into /usr/local from any node. The network cards are Realtek RTL8139 Family PCI Fast Ethernet NICs (100 Mbps) and the switch is a 3C16980A (10/100 Mbps, 24 ports); there is no hub. I installed pgi5.2 under /usr/local on the master node and compiled lam-7.1.1 with pgf90, also under /usr/local, so users can use both from any node. Then I built the cpmd.x executable with the configure file PC-PGI-MPI. Lastly, I copied cpmd.x to /bin on every node. (Do I need to copy it to every node? It did not run when I only copied it to the shared directory /usr/local/bin on the master node and exported its path. But then why do I not need to copy pgi and lam to every node?)
    That is my cluster. However, I find that its CPU efficiency is very low in the LAM parallel environment. When I run cpmd.x on a single machine (the input file is al001geo.inp from cpmd-test), the CPU time is equal to the elapsed time. But when I use two or three machines, the elapsed time is about 2 or 3 times the CPU time. The timing sections of the outputs look like this:
single machine:
 ****************************************************************
 *                                                              *
 *                            TIMING                            *
 *                                                              *
 ****************************************************************
 SUBROUTINE            CALLS         CPU TIME        ELAPSED TIME
   S_INVFFT           359172          1431.84             1442.89
    S_FWFFT           322518          1352.48             1362.99
    FFT-G/S          2049258          1241.91             1242.14
    EHPSI_C            27459           303.83              303.93
      EVPSI           248918           220.83              218.67
   OVLAP2_C            28289           124.48              181.66
      VBETA             2930           122.21              121.11
 FRIESNER_C             2930            97.21               96.76
     FFTCOM           683784            64.48               66.20
   RHOOFR_C              294            59.46               59.77
    OVLAP_H             6060            19.88               19.96
      RGS_C             2958            18.92               23.18
 ----------------------------------------------------------------
 TOTAL TIME                           5057.54             5139.27
 ****************************************************************


tf0 and tf1 nodes:
 ****************************************************************
 *                                                              *
 *                            TIMING                            *
 *                                                              *
 ****************************************************************
 SUBROUTINE            CALLS         CPU TIME        ELAPSED TIME
   S_INVFFT           359172           713.14              731.76
    S_FWFFT           322518           682.59              700.11
    FFT-G/S          2049258           348.21              358.42
    EHPSI_C            27459           133.98              134.44
      EVPSI           248918            90.77               92.18
     FFTCOM           683784            63.16             1772.80
   OVLAP2_C            28289            55.99               84.66
      VBETA             2930            49.19               49.31
 FRIESNER_C             2930            48.24               51.46
   RHOOFR_C              294            21.82               22.30
      RGS_C             2958             9.75               12.25
    OVLAP_H             6060             8.02                8.57
 ----------------------------------------------------------------
 TOTAL TIME                           2224.86             4018.26
 ****************************************************************


tf0, tf1 and tf2 nodes:
 ****************************************************************
 *                                                              *
 *                            TIMING                            *
 *                                                              *
 ****************************************************************
 SUBROUTINE            CALLS         CPU TIME        ELAPSED TIME
   S_INVFFT           359175           377.80              396.70
    S_FWFFT           322521           362.50              369.35
    FFT-G/S          2049276           179.85              186.67
    EHPSI_C            27459            90.71               94.63
     FFTCOM           683790            67.74             2216.95
   OVLAP2_C            28289            33.68               51.80
 FRIESNER_C             2930            33.47               40.22
      EVPSI           248921            33.09               34.26
      VBETA             2930            25.33               25.35
   RHOOFR_C              294            13.19               12.74
     GLOSUM           402851             9.11              113.39
      RGS_C             2958             6.46                8.29
    OVLAP_H             6060             5.52                5.57
 ----------------------------------------------------------------
 TOTAL TIME                           1238.45             3555.94
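
As a rough check on the three TOTAL TIME lines above, the following Python sketch computes the CPU/elapsed ratio and the elapsed-time speedup relative to the single-machine run. The numbers are copied from the tables; the names `totals` and `summarize` are only illustrative, and "efficiency" here is simply the speedup divided by the number of nodes.

```python
# TOTAL TIME lines copied from the three timing tables:
# number of nodes -> (CPU seconds, elapsed seconds)
totals = {
    1: (5057.54, 5139.27),  # single machine
    2: (2224.86, 4018.26),  # tf0 and tf1
    3: (1238.45, 3555.94),  # tf0, tf1 and tf2
}


def summarize(totals):
    """Return {nodes: (cpu/elapsed, speedup, parallel efficiency)}."""
    serial_elapsed = totals[1][1]
    summary = {}
    for nodes, (cpu, elapsed) in sorted(totals.items()):
        cpu_fraction = cpu / elapsed        # share of wall time spent computing
        speedup = serial_elapsed / elapsed  # wall-clock gain over one machine
        efficiency = speedup / nodes        # 1.0 would be ideal scaling
        summary[nodes] = (cpu_fraction, speedup, efficiency)
    return summary


summary = summarize(totals)
for nodes, (frac, speedup, eff) in summary.items():
    print(f"{nodes} node(s): CPU/elapsed = {frac:.2f}, "
          f"speedup = {speedup:.2f}, efficiency = {eff:.2f}")
```

With these totals the elapsed-time speedup is only about 1.28 on two nodes and 1.45 on three, even though the CPU time per node drops almost in proportion to the node count.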

As you can see, the FFTCOM and GLOSUM parts (what do they mean?) cost a lot of elapsed time. What should I do to improve the CPU efficiency and reduce the elapsed time? Any ideas will be appreciated!
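
To make the comparison concrete, this small Python sketch ranks the routines of the three-node run by the gap between elapsed and CPU time, that is, the time spent not computing. The values are copied from the three-node table above; the names `three_nodes` and `rank_by_wait` are hypothetical helpers, not part of CPMD.

```python
# (CPU seconds, elapsed seconds) per routine, copied from the three-node table
three_nodes = {
    "S_INVFFT":   (377.80, 396.70),
    "S_FWFFT":    (362.50, 369.35),
    "FFT-G/S":    (179.85, 186.67),
    "EHPSI_C":    (90.71, 94.63),
    "FFTCOM":     (67.74, 2216.95),
    "OVLAP2_C":   (33.68, 51.80),
    "FRIESNER_C": (33.47, 40.22),
    "EVPSI":      (33.09, 34.26),
    "VBETA":      (25.33, 25.35),
    "RHOOFR_C":   (13.19, 12.74),
    "GLOSUM":     (9.11, 113.39),
    "RGS_C":      (6.46, 8.29),
    "OVLAP_H":    (5.52, 5.57),
}


def rank_by_wait(timings):
    """Return (routine, elapsed - cpu) pairs, largest gap first."""
    gaps = [(name, elapsed - cpu) for name, (cpu, elapsed) in timings.items()]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


ranked = rank_by_wait(three_nodes)
for name, gap in ranked[:3]:
    print(f"{name:>10}: {gap:8.2f} s of elapsed time beyond CPU time")
```

FFTCOM and GLOSUM come out first and second in this ranking, and FFTCOM alone accounts for about 2149 s of the gap.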

Thanks in advance!
Best wishes!
                                                                                shyma