
Re: [Condor-users] about negotiator and scheduler



On Fri, Feb 18, 2005 at 02:52:54PM +0800, Carson Hung wrote:
> Hi
> 
> I would like to know more details about the working of negotiator
> actually. In each negotiation cycle, the machines and users' jobs are
> matched in the scheduler, right?
> If they match in the scheduler, but the machines rejected the works, will
> the jobs remained in the scheduler?
> 

What do you mean by "rejected the works"? If a job is not matched in a
negotiation cycle, it stays in the queue.

> I have this question as I would like to know what will happen if there are
> more than one scheduler in the grids site or clusters. and more than 1
> schedulers match the jobs to a certain machine or grids.
> 

What do you mean by "scheduler"? "Scheduling" in Condor is really split
between two daemons.

1. The condor_schedd - this daemon maintains the job queue. If it is
told _exactly_ where to run the job (i.e., with Condor-G, with
globusscheduler = some.machine.com/jobmanager), it can immediately
start running the job.

2. The condor_negotiator (and to an extent, the condor_collector) - this
daemon is the "matchmaker", and it is effectively the "scheduler" for a pool.
There is only one negotiator per pool.

You really can't ask about the "schedulers" in Condor, because the term is
ambiguous: it could refer to either of these daemons.

> One more question about machine definition in condor, it will only match
> one job to a cluster at each negotiation cycle, right? How's the case for
> condor-g?
> 

A "cluster" in Condor means a collection of jobs in the schedd (i.e., a
"cluster" of 100 jobs, from 'queue 100' in a submit file). Condor will
keep matching jobs in that cluster as long as it can find resources
for them. There is no difference between matchmaking in Condor and in
Condor-G in terms of matching per cluster.
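
As a rough illustration of "keep matching jobs in that cluster as long as
it can find resources", here is a toy Python sketch. It is not Condor code:
the negotiate() function and the machine names are made up for this example,
and real matchmaking also evaluates requirements and rank, which are left
out here. The only Condor convention used is the cluster.proc job id.

```python
# Toy model only: not Condor code. All names here are hypothetical.
def negotiate(cluster_jobs, free_machines):
    """Pair idle jobs from one cluster with free machines, one job per machine."""
    matches = []
    idle = list(cluster_jobs)
    machines = list(free_machines)
    # Keep matching until we run out of either jobs or machines.
    while idle and machines:
        matches.append((idle.pop(0), machines.pop(0)))
    return matches, idle


# Cluster 1 with procs 0..99, as 'queue 100' in a submit file would create.
jobs = [f"1.{proc}" for proc in range(100)]
machines = [f"node{i}" for i in range(30)]  # assumed pool of 30 free machines

matches, still_idle = negotiate(jobs, machines)
print(len(matches), len(still_idle))
```

With 100 jobs and 30 free machines, 30 jobs are matched and the other 70
stay idle in the queue until a later cycle.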

The one real difference between matchmaking in Condor and Condor-G is
that there is no claiming protocol in Condor-G. In regular Condor, when
we match a job to a resource, the ad for the resource gets removed from
the collector, and we don't match it to a different job in the next
negotiation cycle. Since there's no claiming in Condor-G, if you're
not careful you can match the same resource many times, unless you
get rid of the ad in the condor_collector for that resource.
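
To make the effect of claiming concrete, here is another toy Python sketch
(again not Condor code; run_cycles() and the "gatekeeper" ad name are
hypothetical). It only models one thing: whether a resource ad is removed
from the collector once it has been matched.

```python
# Toy model only: not Condor code. All names here are hypothetical.
def run_cycles(jobs, resource_ads, cycles, claiming):
    """Run several negotiation cycles; return the (job, resource) matches."""
    ads = set(resource_ads)  # stands in for the ads held by the collector
    idle = list(jobs)
    matches = []
    for _ in range(cycles):
        # sorted() returns a new list, so removing from 'ads' below is safe.
        for ad in sorted(ads):
            if not idle:
                break
            matches.append((idle.pop(0), ad))
            if claiming:
                # Regular Condor: the matched ad leaves the collector.
                ads.discard(ad)
            # Without claiming the ad stays and can be matched again.
    return matches


jobs = ["j0", "j1", "j2", "j3"]
with_claim = run_cycles(jobs, ["gatekeeper"], cycles=3, claiming=True)
no_claim = run_cycles(jobs, ["gatekeeper"], cycles=3, claiming=False)
print(len(with_claim), len(no_claim))
```

With claiming, the single resource is matched once over three cycles.
Without it, the same ad is matched once per cycle, so three jobs end up
pointed at the same resource.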

-Erik