[CPMD-list] MEMORY ALLOCATION FAILED at Cray XT3
Axel Kohlmeyer
akohlmey at vitae.cmm.upenn.edu
Thu Aug 24 06:50:43 CEST 2006
On Wed, 23 Aug 2006, Alexandr Isayev wrote:
dear alex,
the behavior you are describing is consistent with other machines.
your job is actually trying to allocate about 1.15GB of memory
(144020240 words, each word is 8 bytes) when it has only about
1.1GB left on the heap.
this is a 'feature' of the current implementation of the
atomic guess. after the guess is created, the memory is freed
again. so as a workaround please try setting INITIALIZE
WAVEFUNCTION RANDOM (especially when you are already
restarting, i.e. not using the atomic guess at all).
for most machines, this 'waste' of memory does not show,
since it only happens during the initialization, but
on the XT3 you have no swap (same as on a BG/L btw),
so your job crashes.
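as a rough check of the numbers, here is a minimal python sketch (not part of cpmd). it takes one representative request and one free-heap figure from the quoted output. the helper name request_fits and the choice of 1024 bytes per kByte are assumptions for illustration; the conclusion is the same with 1000 bytes per kByte.

```python
WORD_BYTES = 8  # each word is 8 bytes, as stated in the reply


def request_fits(words, free_kbytes, kbyte=1024):
    """Compare a request in words against the free heap in kBytes.

    Returns (requested_bytes, free_bytes, fits).
    kbyte is an assumption: the log does not say whether a kByte
    is 1000 or 1024 bytes.
    """
    requested = words * WORD_BYTES
    free = free_kbytes * kbyte
    return requested, free, requested <= free


# figures copied from the quoted output:
#   ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
#   CURRENT HEAP USED/FREE 807592/ 1103191 kBytes
requested, free, fits = request_fits(144020240, 1103191)
print(f"requested: {requested / 1e9:.2f} GB")
print(f"free:      {free / 1e9:.2f} GB")
print("request fits" if fits else "request does not fit")
```

on a machine with swap the same request would normally go through and the memory would be released right after the atomic guess.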
cheers,
axel.
AI> Dear CPMD community:
AI>
AI> I experienced strange CPMD (v3.11.1) behavior on a Cray XT3.
AI> *ANY* job that requires more than a few hundred MB of memory fails with
AI>
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 0]
AI>
AI> However, this particular XT3 has 2 GB per CPU. At first, I thought that
AI> yod could not allocate the required amount of memory by default. I played
AI> with different settings, but that did not change anything. I also worked
AI> with the local admins. They tested and found no problems with yod, the
AI> catamount nodes, etc. It can allocate more than 1.8 GB with default settings.
AI> I also recompiled the code with the default XT3 settings, but it did not
AI> help either.
AI>
AI> The example below is just a standard box with 216 waters,
AI> PBE, 80 Ry cutoff. It needs about 1 GB per CPU or less.
AI>
AI> Can someone confirm my observations on other XT3s?
AI>
AI> Thank you in advance,
AI> Alexandr
AI>
AI> Relevant output part is attached below:
AI> ==================================================================
AI>
AI> PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
AI> NCPU NGW NHG PLANES GXRAYS HXRAYS ORBITALS Z-PLANES
AI> 0 16665 133357 13 498 1986 54 1
AI> 1 16669 133355 14 498 1986 54 1
AI> 2 16669 133337 13 498 1986 54 1
AI> 3 16669 133333 14 498 1986 54 1
AI> 4 16669 133349 13 498 1986 54 1
AI> 5 16667 133340 14 498 1988 54 1
AI> 6 16665 133340 13 498 1988 54 1
AI> 7 16669 133336 14 498 1988 54 1
AI> 8 16669 133244 13 497 1987 54 1
AI> 9 16669 133318 14 498 1988 54 1
AI> 10 16669 133332 13 498 1988 54 1
AI> 11 16663 133332 14 498 1988 54 1
AI> 12 16661 133338 13 498 1988 54 1
AI> 13 16661 133324 14 498 1988 54 1
AI> 14 16659 133318 13 498 1988 54 1
AI> 15 16656 133322 14 496 1988 54 1
AI> G=0 COMPONENT ON PROCESSOR : 8
AI> PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
AI>
AI> *** LOADPA| CURRENT HEAP USED/FREE 86007/ 1824776 kBytes ***
AI>
AI> OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN
AI> NUMBER OF CPUS PER TASK 1
AI> OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN
AI>
AI> *** RGGEN| CURRENT HEAP USED/FREE 90791/ 1819992 kBytes ***
AI>
AI> ************************** SUPERCELL ***************************
AI> SYMMETRY: SIMPLE CUBIC
AI> LATTICE CONSTANT(a.u.): 35.34045
AI> CELL DIMENSION: 35.3404 1.0000 1.0000 0.0000 0.0000 0.0000
AI> VOLUME(OMEGA IN BOHR^3): 44138.34843
AI> LATTICE VECTOR A1(BOHR): 35.3404 0.0000 0.0000
AI> LATTICE VECTOR A2(BOHR): 0.0000 35.3404 0.0000
AI> LATTICE VECTOR A3(BOHR): 0.0000 0.0000 35.3404
AI> RECIP. LAT. VEC. B1(2Pi/BOHR): 0.0283 0.0000 0.0000
AI> RECIP. LAT. VEC. B2(2Pi/BOHR): 0.0000 0.0283 0.0000
AI> RECIP. LAT. VEC. B3(2Pi/BOHR): 0.0000 0.0000 0.0283
AI> REAL SPACE MESH: 216 216 216
AI> WAVEFUNCTION CUTOFF(RYDBERG): 80.00000
AI> DENSITY CUTOFF(RYDBERG): (DUAL= 4.00) 320.00000
AI> NUMBER OF PLANE WAVES FOR WAVEFUNCTION CUTOFF: 266649
AI> NUMBER OF PLANE WAVES FOR DENSITY CUTOFF: 2133275
AI> ****************************************************************
AI>
AI> *** RINFORCE| CURRENT HEAP USED/FREE 97593/ 1813190 kBytes ***
AI> *** FFTPRP| CURRENT HEAP USED/FREE 116984/ 1793799 kBytes ***
AI>
AI> GENERATE ATOMIC BASIS SET
AI> O SLATER ORBITALS
AI> 2S ALPHA= 2.2458 OCCUPATION= 2.00
AI> 2P ALPHA= 2.2266 OCCUPATION= 4.00
AI> H SLATER ORBITALS
AI> 1S ALPHA= 1.0000 OCCUPATION= 1.00
AI>
AI> INITIALIZATION TIME: 40.33 SECONDS
AI>
AI> ****************************************************************
AI> PROCESSOR 3 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 8 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 12 ALLOCATION OF 143951120 WORDS OF MEMORY FAILED
AI> PROCESSOR 4 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 11 ALLOCATION OF 143968400 WORDS OF MEMORY FAILED
AI> PROCESSOR 2 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 7 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 1 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 13 ALLOCATION OF 143951120 WORDS OF MEMORY FAILED
AI> PROCESSOR 10 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 5 ALLOCATION OF 144002960 WORDS OF MEMORY FAILED
AI> PROCESSOR 9 ALLOCATION OF 144020240 WORDS OF MEMORY FAILED
AI> PROCESSOR 6 ALLOCATION OF 143985680 WORDS OF MEMORY FAILED
AI> PROCESSOR 15 ALLOCATION OF 143907920 WORDS OF MEMORY FAILED
AI> PROCESSOR 14 ALLOCATION OF 143933840 WORDS OF MEMORY FAILED
AI> ****************************************************************
AI> PROCESSOR 0 ALLOCATION OF 143985680 WORDS OF MEMORY FAILED
AI> *** MEMORY| CURRENT HEAP USED/FREE 807592/ 1103191 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807573/ 1103210 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807270/ 1103513 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807595/ 1103188 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807348/ 1103435 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807593/ 1103190 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807589/ 1103194 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807596/ 1103187 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807261/ 1103522 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807592/ 1103191 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807508/ 1103275 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807585/ 1103198 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807431/ 1103351 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807064/ 1103719 kBytes ***
AI> *** MEMORY| CURRENT HEAP USED/FREE 807184/ 1103599 kBytes ***
AI> ****************************************************************
AI> ================================================================
AI> *** MEMORY| CURRENT HEAP USED/FREE 809542/ 1101241 kBytes ***
AI> BIG MEMORY ALLOCATIONS
AI> ================================================================
AI> XF 1412670 GK 399999
AI> XF 1412670 GK 399732
AI> XF 1412670 GK 400014
AI> XF 1412670 GK 400047
AI> XF 1412670 GK 399996
AI> XF 1412670 GK 400011
AI> XF 1412670 GK 400008
AI> XF 1412670 GK 400065
AI> XF 1412670 GK 399972
AI> XF 1412670 GK 399996
AI> XF 1412670 GK 400020
AI> XF 1412670 GK 399954
AI> XF 1412670 GK 400020
AI> XF 1412670 GK 399966
AI> XF 1412670 GK 399954
AI> BIG MEMORY ALLOCATIONS
AI> INZHP 599999 C2 28804040
AI> INZHP 599599 C2 28804040
AI> INZHP 600022 C2 28790216
AI> INZHP 600071 C2 28804040
AI> INZHP 599995 C2 28793672
AI> INZHP 600017 C2 28804040
AI> INZHP 600013 C2 28804040
AI> INZHP 600098 C2 28804040
AI> INZHP 599959 C2 28790216
AI> INZHP 599995 C2 28804040
AI> INZHP 600031 C2 28800584
AI> INZHP 599932 C2 28804040
AI> INZHP 600031 C2 28797128
AI> INZHP 599950 C2 28781576
AI> INZHP 599932 C2 28786760
AI> XF 1412670 GK 400071
AI> SCG 266666 C0 28804040
AI> SCG 266488 C0 28804040
AI> SCG 266676 C0 28790216
AI> SCG 266698 C0 28804040
AI> SCG 266664 C0 28793672
AI> SCG 266674 C0 28804040
AI> SCG 266672 C0 28804040
AI> SCG 266710 C0 28804040
AI> SCG 266648 C0 28790216
AI> SCG 266664 C0 28804040
AI> SCG 266680 C0 28800584
AI> SCG 266636 C0 28804040
AI> SCG 266680 C0 28797128
AI> SCG 266644 C0 28781576
AI> SCG 266636 C0 28786760
AI> C0 28797128 SCG 266714
AI> VPS 266666 SC0 28804032
AI> VPS 266488 SC0 28804032
AI> VPS 266676 SC0 28790208
AI> VPS 266698 SC0 28804032
AI> VPS 266664 SC0 28793664
AI> VPS 266674 SC0 28804032
AI> VPS 266672 SC0 28804032
AI> VPS 266710 SC0 28804032
AI> VPS 266648 SC0 28790208
AI> VPS 266664 SC0 28804032
AI> VPS 266680 SC0 28800576
AI> VPS 266636 SC0 28804032
AI> VPS 266680 SC0 28797120
AI> VPS 266644 SC0 28781568
AI> VPS 266636 SC0 28786752
AI> INZHP 600107 VPS 266714
AI> RHOPS 266666 YF 1412670
AI> RHOPS 266488 YF 1412670
AI> RHOPS 266676 YF 1412670
AI> RHOPS 266698 YF 1412670
AI> RHOPS 266664 YF 1412670
AI> RHOPS 266674 YF 1412670
AI> RHOPS 266672 YF 1412670
AI> RHOPS 266710 YF 1412670
AI> RHOPS 266648 YF 1412670
AI> RHOPS 266664 YF 1412670
AI> RHOPS 266680 YF 1412670
AI> RHOPS 266636 YF 1412670
AI> RHOPS 266680 YF 1412670
AI> RHOPS 266644 YF 1412670
AI> RHOPS 266636 YF 1412670
AI> YF 1412670 SC0 28797120
AI> ----------------------------------------------------------------
AI> RHOPS 266714 C2 28797128
AI> [PEAK NUMBER 68] PEAK MEMORY 93409911 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93407972 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93368568 = 746.9 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93410341 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93378775 = 747.0 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93409999 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93409850 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93410345 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93368090 = 746.9 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93409879 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93399677 = 747.2 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93409576 = 747.3 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93389298 = 747.1 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93342181 = 746.7 MBytes
AI> [PEAK NUMBER 68] PEAK MEMORY 93357703 = 746.9 MBytes
AI> ----------------------------------------------------------------
AI> ================================================================
AI> [PEAK NUMBER 66] PEAK MEMORY 93390512 = 747.1 MBytes
AI> ================================================================
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 3]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 8]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 12]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 4]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 11]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 2]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 7]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 1]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 13]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 10]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 5]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 9]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 6]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 15]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 14]
AI> PROGRAM STOPS IN SUBROUTINE MEMORY| ALLOCATION FAILED (PME) [PROC= 0]
AI>
AI> -------------------------------------------------------
AI> Alexandr Isayev,
AI> Graduate Research Assistant, and System Administrator
AI> @ Computational Center for Molecular Structure
AI> and Interactions (CCMSI),
AI> Jackson State University,
AI> Jackson, MS USA
AI> Tel: +(601) 979-1134
AI> e-mail: alex(at)ccmsi.us
AI> Web: http://www.ccmsi.us
AI> --------------------------------------------------------
--
=======================================================================
Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
Center for Molecular Modeling -- University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.