[CPMD-list] mpirun can't run

Reuti reuti at staff.uni-marburg.de
Fri Feb 25 11:25:28 CET 2005


Hi,

you are using LAM/MPI, which is a daemon-based solution. Hence the started 
programs will be children of lamd rather than of mpirun, and they do the 
actual work. You may see more when you use:

ps -e f -o pid,ppid,pgrp,user,time,command --cols=120

(note the space between -e and f). Is your program a child of lamd? mpirun 
itself shouldn't consume much CPU time, so that part is okay. But cpmd.x 
should be using the CPU. Is that also not the case?
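
The parent check above can also be scripted. This is only a minimal sketch, 
assuming a procps-style ps that accepts repeated -o options; the function 
name is_child_of is made up for illustration and is not part of LAM or CPMD:

```shell
# is_child_of CHILD PARENT   (hypothetical helper)
# Reads "pid ppid command" lines on stdin and succeeds (exit status 0) if
# some process named CHILD has a parent process named PARENT.
is_child_of() {
  awk -v child="$1" -v parent="$2" '
    { ppid[$1] = $2; name[$1] = $3 }   # remember parent and name per pid
    END {
      found = 0
      for (p in name)
        if (name[p] == child && (ppid[p] in name) && name[ppid[p]] == parent)
          found = 1
      exit(found ? 0 : 1)
    }'
}
```

It could be used like this on the node in question:

ps -e -o pid= -o ppid= -o comm= | is_child_of cpmd.x lamd && echo "cpmd.x runs under lamd"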

BTW: There is also a "one-shot mpirun" that does lamboot/mpirun/lamhalt in 
just one command: mpiexec (page 57 in the manual).
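
For comparison, the three manual steps that mpiexec replaces can be written 
out by hand. This is a rough sketch only: the names run and lam_one_shot and 
the DRY_RUN variable are made up for illustration, and this is not how 
mpiexec itself is implemented.

```shell
# run: execute a command, or just print it when DRY_RUN is set (hypothetical)
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# lam_one_shot HOSTFILE PROGRAM [ARGS...]   (hypothetical helper)
# Boots LAM on the hosts in HOSTFILE, runs PROGRAM on every CPU ("C"),
# then shuts LAM down again and returns the exit status of mpirun.
lam_one_shot() {
  hostfile=$1
  shift
  run lamboot "$hostfile" || return 1
  run mpirun C "$@"
  status=$?
  run lamhalt
  return "$status"
}
```

With DRY_RUN=1 set, "lam_one_shot machinefile cpmd.x cpmd-init-wave-1.in" 
only prints the three commands, which is a cheap way to check the sequence 
before touching the cluster.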

Cheers - Reuti


Quoting s101bayu at mail.chem.itb.ac.id:

> Dear all,
> 
> I have installed a cluster (OSCAR Cluster) and recompiled LAM with the
> Intel Fortran Compiler. The OSCAR installation, the LAM recompilation,
> and the compilation of CPMD with the mpif77 compiler all succeeded.
> 
> But when I tried running "mpirun C cpmd.x input > output", it did not
> run.
> My steps were:
> # recon -v machinefile
> 
> The message is:
> n-1<25478> ssi:boot:base:linear: booting n0 (167.205.72.13)
> n-1<25478> ssi:boot:base:linear: booting n1 (167.205.72.9)
> n-1<25478> ssi:boot:base:linear: booting n2 (167.205.72.10)
> n-1<25478> ssi:boot:base:linear: booting n3 (167.205.72.11)
> n-1<25478> ssi:boot:base:linear: finished
> -----------------------------------------------------------------------------
> Woo hoo!
> 
> recon has completed successfully.  This means that you will most likely
> be able to boot LAM successfully with the "lamboot" command (but this
> is not a guarantee).  See the lamboot(1) manual page for more
> information on the lamboot command.
> 
> If you have problems booting LAM (with lamboot) even though recon
> worked successfully, enable the "-d" option to lamboot to examine each
> step of lamboot and see what fails.  Most situations where recon
> succeeds and lamboot fails have to do with the hboot(1) command (that
> lamboot invokes on each host in the hostfile).
> -----------------------------------------------------------------------------
> 
> # lamboot -v machinefile
> 
> The message is:
> 
> LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> 
> n-1<25507> ssi:boot:base:linear: booting n0 (167.205.72.13)
> n-1<25507> ssi:boot:base:linear: booting n1 (167.205.72.9)
> n-1<25507> ssi:boot:base:linear: booting n2 (167.205.72.10)
> n-1<25507> ssi:boot:base:linear: booting n3 (167.205.72.11)
> n-1<25507> ssi:boot:base:linear: finished
> 
> # mpirun C cpmd.x cpmd-init-wave-1.in > cpmd-init-wave-1.out &
> 
> At this point there is no error message.
> But I think mpirun is not actually running, because its CPU time does
> not increase.
> [bayu at kimia-13 tutorial]$ ps
>   PID TTY          TIME CMD
> 25390 pts/2    00:00:00 bash
> 25991 pts/2    00:00:00 mpirun
> 25993 pts/2    00:00:00 ps
> 
> After a few minutes the time for mpirun has not changed:
> 
> [bayu at kimia-13 tutorial]$ ps
>   PID TTY          TIME CMD
> 25390 pts/2    00:00:00 bash
> 25991 pts/2    00:00:00 mpirun
> 25993 pts/2    00:00:00 ps
> 
> [bayu at kimia-13 tutorial]$ ps -ef | grep cpmd.x
> bayu     25991 25390  0 04:08 pts/2    00:00:00 mpirun C cpmd.x cpmd-init-wave-1.in
> bayu     25992 25510  0 04:08 ?        00:00:00 cpmd.x cpmd-init-wave-1.in
> bayu     25998 25390  0 04:09 pts/2    00:00:00 grep cpmd.x
> 
> I don't know what the problem is.
> My computers are Pentium 4 machines running Red Hat Linux 9.
> The server has a graphical interface, but the clients have a text-only
> interface.
> 
> Can anyone help me?
> 
> thanks,
> bayu
> 
> _______________________________________________
> CPMD-list mailing list
> CPMD-list at cpmd.org
> http://cpmd.org/mailman/listinfo/cpmd-list
> 




