
[Condor-users] Shadow Exception


I recently added a Linux machine with 24 slots (so it should be able to run 24 jobs) to our Condor pool.

After about 8 jobs had started running on the Linux machine, the central manager could not send it any more jobs, and jobs are now failing with a Shadow Exception:

Can no longer talk to condor_starter <xxx.xxx.xxx.xxx:9604>

Error from slot8@xxxxxxxxxxxxxxxx: Could not initiate file transfer

The Linux machine is behind a firewall, and we have opened the firewall for the port range below.




It looks like the condor_starter, which is using one of the ports from the above range (port 9604 in this case), cannot handle more than 8 connections, and so no more than 8 jobs can run.
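
One way to narrow this down might be to probe the opened range from the submit machine and see which ports actually get through the firewall. The following is only a rough sketch in Python, not a Condor tool: the function names probe_port and probe_range are made up for illustration, and the host and port numbers given on the command line are whatever applies to your setup. A result of "closed" means the connection was refused (the packet got through, but nothing is listening on that port); "filtered" is a guess that something, possibly the firewall, dropped or blocked the attempt.

```python
import socket
import sys


def probe_port(host, port, timeout=1.0):
    """Try one TCP connection and classify the outcome.

    Returns "open" if the connection succeeds, "closed" if it is
    actively refused, and "filtered" for timeouts or other socket
    errors (an assumption: those may also have unrelated causes).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"
    except OSError:
        # Covers timeouts, unreachable hosts, and similar failures.
        return "filtered"


def probe_range(host, low, high, timeout=1.0):
    """Probe every port in [low, high] and return {port: status}."""
    return {port: probe_port(host, port, timeout)
            for port in range(low, high + 1)}


if __name__ == "__main__":
    # Usage (hypothetical script name): python probe_ports.py HOST LOW HIGH
    host, low, high = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
    for port, status in sorted(probe_range(host, low, high).items()):
        print(port, status)
```

If the ports beyond the first few in the range show up as "filtered", that would point at the firewall rules or the size of the range rather than at a limit inside a single condor_starter.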


Can anyone suggest how this can be fixed so that jobs can run on all 24 slots?