[CPMD-list] error in BO-PIMD

qfzhang qfzhang at aphy.iphy.ac.cn
Sun Jun 10 05:32:04 CEST 2007


Hi,
   Thanks for your advice! Can I solve the problem by rewriting some part of the
program? I have added some "CALL MY_SYNC(SUPERGROUP)" statements in pi_diag.F,
but it does not seem to work. You also mentioned the compiler. Can I solve the
problem by changing something at compile time?
   
   Best wishes
                                                     Qianfan Zhang

Axel Kohlmeyer wrote:

> On Sat, 9 Jun 2007, qfzhang wrote:
> 
> QZ> Hi,
> QZ>   So sorry for that. It is very strange that when running BO-PIMD job,
> QZ> nothing is written to the output file since "force initialization", and
> QZ> nothing to the file TRAJECTORY and ENERGY. But the job will not stop until
> QZ> the walltime limit, and
> 
> hi,
> this is not strange, you just discovered a 'deadlock' bug due to 
> a so-called race condition. this can happen with PI-MD, when the
> individual replica take significantly different time to do some
> work yet the code is written in a way that expects about the same
> time spent.
> 
> you don't see any output to the files, since your compiler defaults
> to buffered output (indeed something is written, but the first MD
> step stalls, at least when trying to reproduce it on my machine).
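
The effect of buffered output can be seen in a small stand-alone sketch. It is
written in Python rather than Fortran and has nothing to do with the CPMD
sources; the function name sizes_before_and_after_flush is hypothetical. It
writes a short line to a file opened with an explicit buffer and compares the
file size on disk before and after an explicit flush. A job that stalls before
its buffer is flushed leaves the file looking empty in the same way.

```python
import os
import tempfile


def sizes_before_and_after_flush(text):
    """Write `text` through a buffered file object and return the on-disk
    file size before and after an explicit flush."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # reopen below with an explicit buffer size
    try:
        # buffering=8192 is an assumed buffer size, larger than `text`
        with open(path, "w", buffering=8192) as f:
            f.write(text)
            before = os.path.getsize(path)  # data still sits in the buffer
            f.flush()
            after = os.path.getsize(path)   # data has reached the file
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(sizes_before_and_after_flush("first MD step\n"))
```

In Fortran the comparable tools would be the FLUSH statement (Fortran 2003) or
a compiler or runtime setting for unbuffered I/O; which of these applies
depends on the compiler in use.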
> 
> QZ> no error message. But when I specify PROCESSOR GROUP=1, there is no
> QZ> problem. When I use CP-PI
> 
> with no processor groups there is no parallelization over replica,
> and it seems that exactly that is causing the problems. with CP-MD
> all operations take about the same time per replica, but with BO-MD
> this is not always the case (different number of WF-opt steps for
> different replica).
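
The mismatch described above can be reproduced in a toy model. The sketch below
is not CPMD code: it uses Python threads and threading.Barrier in place of MPI
processor groups, and run_replicas and its parameters are hypothetical names
chosen for illustration. Each "replica" performs a different number of
optimization steps. If every step contains a synchronization call, the replica
with more steps ends up waiting for a partner that has already left the loop.
A timeout stands in for the hang so that the example terminates.

```python
import threading


def run_replicas(steps_per_replica, sync_inside_loop, timeout=0.5):
    """Return True if all replicas finish, False if a barrier wait timed out
    (the stand-in for a deadlock)."""
    barrier = threading.Barrier(len(steps_per_replica))
    stalled = threading.Event()

    def replica(n_steps):
        try:
            for _ in range(n_steps):
                # stand-in for one wavefunction optimization step
                if sync_inside_loop:
                    barrier.wait(timeout)  # one sync per step: counts differ
            if not sync_inside_loop:
                barrier.wait(timeout)      # one sync per replica: counts match
        except threading.BrokenBarrierError:
            stalled.set()

    threads = [threading.Thread(target=replica, args=(n,))
               for n in steps_per_replica]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not stalled.is_set()


if __name__ == "__main__":
    print("equal steps, sync in loop:    ", run_replicas([4, 4], True))
    print("unequal steps, sync in loop:  ", run_replicas([3, 5], True))
    print("unequal steps, sync after loop:", run_replicas([3, 5], False))
```

If additional sync calls are placed inside a loop whose iteration count differs
between replica, they would show the same pattern instead of curing it. Whether
that is what happens in pi_diag.F is an assumption here and has not been checked
against the source.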
> 
> QZ> MD for calculations, no such problem. So I really don't know what's wrong
> QZ> with it. The output file is as below.
> 
> the cause can probably be found or narrowed down by tracing 
> the parallelization in pi_diag.F. 
> 
> please note, that even though your job appears to be working, all
> it does, is checking for the other parts to communicate which are
> waiting for the first nodes in return (=> deadlock).
> 
> cheers,
>    axel.
> 
> [...]
> 
> 
> -- 
> =======================================================================
> Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
>    Center for Molecular Modeling   --   University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.
> 



