
Re: [Condor-users] Negotiator gets stuck



On Fri February 18 2005 11:01 am, Andrey Kaliazin wrote:
> Thanks Erik,
Andrey,

> I do not doubt the Negotiator's logic in this case it is perfectly valid.
> But I can see that I did
> not explain the problem I have. Let me try again:
<snip>
> So this is the key point of my problem -
> Negotiator quits the cycle immediately after one communication failure.

Do you have more than one schedd?  If all your jobs are from a single schedd,
then, yes, this is exactly what will happen.  The negotiator gets the job ad
list from the collector, pulls the first job from it, tries to contact that
job's schedd, fails, and then ignores all jobs from that schedd for the
remainder of the cycle.  If there are no other schedds in the pool, that will
effectively end the negotiation cycle.
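
To make that concrete, here is a toy Python model of the behaviour described
above.  It is only a sketch, not Condor code: negotiation_cycle, can_contact
and the schedd names are made up for illustration, and the real negotiator
does a good deal more than this.

```python
def negotiation_cycle(job_ads, can_contact):
    """Toy model of one negotiation cycle (hypothetical, not Condor source).

    job_ads     -- list of (schedd_name, job_id) pairs, in collector order
    can_contact -- function returning True if the named schedd is reachable
    """
    failed_schedds = set()   # schedds we gave up on during this cycle
    considered = []          # jobs the negotiator actually got to look at

    for schedd, job_id in job_ads:
        if schedd in failed_schedds:
            continue                    # ignore this schedd for the rest of the cycle
        if not can_contact(schedd):
            failed_schedds.add(schedd)  # one failure is enough to skip the schedd
            continue
        considered.append((schedd, job_id))

    return considered, failed_schedds


# Assumed scenario: "schedd-a" is unreachable, everything else is fine.
reachable = lambda name: name != "schedd-a"

single_schedd = [("schedd-a", 1), ("schedd-a", 2), ("schedd-a", 3)]
two_schedds = single_schedd + [("schedd-b", 4)]

single_result = negotiation_cycle(single_schedd, reachable)
mixed_result = negotiation_cycle(two_schedds, reachable)
```

With a single schedd, nothing is considered and the cycle is effectively
over after the first failure.  With a second schedd in the pool, its jobs
are still considered in the same cycle.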

Or, is there something else going on?

-Nick

-- 
           <<< The matrix has you. >>>
 /`-_    Nicholas R. LeRoy               The Condor Project
{     }/ http://www.cs.wisc.edu/~nleroy  http://www.cs.wisc.edu/condor
 \    /  nleroy@xxxxxxxxxxx              The University of Wisconsin
 |_*_|   608-265-5761                    Department of Computer Sciences