
Re: [Condor-users] Shadow Exception



http://www.cs.wisc.edu/condor/manual/v7.5/3_7Networking_includes.html#30558
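
A rough sanity check on the range quoted below, as a minimal Python sketch rather than anything taken from the manual: it counts the ports in LOWPORT..HIGHPORT and compares that with the slot count. PORTS_PER_SLOT is a hypothetical placeholder, set to 1 on the assumption that each running condor_starter needs at least its own listening port. The real requirement is described at the link above and is likely higher, since the condor_master and condor_startd take ports from the same range too.

```python
def ports_in_range(low: int, high: int) -> int:
    """Count the ports in the inclusive range [low, high]."""
    if high < low:
        raise ValueError("HIGHPORT must be >= LOWPORT")
    return high - low + 1


def min_high_port(low: int, ports_needed: int) -> int:
    """Smallest HIGHPORT that gives `ports_needed` ports starting at `low`."""
    return low + ports_needed - 1


# Values from the quoted message.
LOWPORT = 9600
HIGHPORT = 9620
SLOTS = 24

# ASSUMPTION: hypothetical lower bound of one port per running starter.
# Check the manual for the real figure before sizing a firewall rule.
PORTS_PER_SLOT = 1

available = ports_in_range(LOWPORT, HIGHPORT)
needed = SLOTS * PORTS_PER_SLOT

print(f"ports available: {available}, ports needed (lower bound): {needed}")
if available < needed:
    print(f"range too small; HIGHPORT must be at least "
          f"{min_high_port(LOWPORT, needed)} under this assumption")
```

Even under this optimistic assumption the range is smaller than the number of slots, so widening it (and the matching firewall rule) is the first thing to try.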


On Thu, 2010-07-22 at 11:56 -0400, Natarajan, Senthil wrote:
> Hi,
> 
> I recently added a Linux machine with 24 slots (so it can run 24
> jobs) to our Condor pool.
> 
> After about 8 jobs started running on the Linux machine, the central
> manager could not send it any more jobs, and jobs now fail with a
> Shadow Exception:
> 
> Can no longer talk to condor_starter <xxx.xxx.xxx.xxx:9604>
> 
> Error from slot8@xxxxxxxxxxxxxxxx: Could not initiate file transfer
> 
> The Linux machine is behind a firewall, and we have opened the
> firewall for the port range below.
> 
> HIGHPORT=9620
> 
> LOWPORT=9600
> 
> It looks like the condor_starter, which uses one of the ports from
> the above range (port 9604 in this case), cannot handle more than 8
> connections, and so no more than 8 jobs can run.
> 
> Can anyone suggest how to fix this so that jobs can run on all 24
> slots?
> 
> Thanks,
> 
> Senthil