[CPMD-list] MPI problem

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Tue Apr 27 10:19:21 CEST 2004


On Tue, 27 Apr 2004 latha at sscu.iisc.ernet.in wrote:

L> 
L> hi,
L> 
L> could somebody please help me out with this? 
L> 
L> i am working with quite a huge FCC lattice(20x20x20) which implies that i 
L> am working with the data of 32000 atoms on a parallel machine. why do i 
L> get the following error? 

what is the cpmd output leading up to this? with such a huge 
system cpmd may run into problems: i am not sure whether, for example,
all the array dimensions are suitable, or whether the heuristics used 
to calculate the size of the scratch arrays still work.

L> 
L> p1_24968: (135.318549) net_recv failed for fd = 3
L> bm_list_3497: (135.625553) wakeup_slave: unable to interrupt slave 0 pid 3496
L> bm_list_3497: (135.625711) wakeup_slave: unable to interrupt slave 0 pid 34
L> p2_24786:  p4_error: net_recv read:  probable EOF on socket: 1
L> p1_24968:  p4_error: net_recv read, errno = : 104

this looks like your (ethernet?) network broke down, one of the 
ethernet card drivers was overloaded, or one of the machines crashed.
you are probably better off directing this kind of technical question 
to the MPICH developers.
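
a side note on the "errno = : 104" line: on most linux systems 104 is
ECONNRESET (connection reset by peer), i.e. the other end of the socket
went away, which fits a network or node failure. the numbers are
platform dependent, so it is worth checking on the machine itself. a
minimal python sketch for that lookup, assuming python is available on
the nodes (describe_errno is only an illustrative helper name, it is
not part of MPICH or CPMD):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    # errno.errorcode maps numeric values to names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 104 is the value reported in the p4_error line of the quoted output.
    name, message = describe_errno(104)
    print(f"errno 104: {name} ({message})")
```

if this reports ECONNRESET, the process on the other node most likely
died or became unreachable, so the logs of that node are the next place
to look.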

regards,
	axel kohlmeyer.

L> 
L> many thanx in advance.
L> 
L> regards,
L> Latha.
L> 

-- 


=======================================================================
Dr. Axel Kohlmeyer                        e-mail: axel.kohlmeyer at rub.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum  http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/
=======================================================================




