
Re: [Condor-users] Problem when machine_count > 1 in MPI Universe



How did you set up the prerequisites for MPI jobs, such as the dedicated
scheduler (i.e., what did you put in condor_config.local and condor_config)?
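
For comparison, here is a minimal sketch of the kind of settings I would
expect in condor_config.local on each execute machine. It assumes the submit
machine is called submit.example.org (a placeholder hostname, not taken from
your pool) and that the nodes should run only dedicated jobs. Please check it
against the manual for your Condor version before relying on it.

```
# Sketch only: submit.example.org is a hypothetical hostname; replace it
# with the full hostname of the machine running your dedicated scheduler.
DedicatedScheduler = "DedicatedScheduler@submit.example.org"

# Advertise the attribute in the machine ClassAd so the scheduler can claim it.
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# Assumed policy: always start jobs, never suspend or preempt them.
START        = True
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False

# Prefer jobs that come from the dedicated scheduler.
RANK         = Scheduler =?= $(DedicatedScheduler)
```

If your files differ from this on any of the four machines, please post the
relevant lines, and remember that the daemons need a condor_reconfig (or a
restart) after any change.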

--------- Original Message --------
From: Neeraj Chourasia <neeraj_ch1@xxxxxxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Subject: [Condor-users] Problem when machine_count > 1 in MPI Universe
Date: 09/09/05 05:31

>
> Hi All,
>
> I am trying to run the CPI executable that comes with the MPI installation
> on a Condor pool of 4 machines. I have set up all the prerequisites for MPI
> jobs, such as the "dedicated scheduler". To my surprise, the job runs fine
> as long as machine_count is 1. If we increase machine_count, it fails with
> an error like:
> /*****************************************************/
> rm_3948: (-) net_recv failed for fd = 3
> rm_3948:  p4_error: net_recv read, errno = : 104
> /******************************************************/
>
> I have gone through all the previous mails on this particular issue, but
> I am still facing the same problem.
>
> Please help me out
>
> Neeraj
>
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users