Re: [Condor-users] fetchwork vs. claim_worklife



Hi,

Sorry for the delay.

On Tuesday 12 April 2011 20:15:16 Dan Bradley wrote:
> > Version 7.4.4
> > But I was not aware that preemption is needed to claim an idle slot
> 
> The logs you posted showed the slot transitioning to Claimed/Idle, not
> Unclaimed/Idle.  Therefore, the work-fetch job must preempt the claim of
> the schedd that is holding it.  I can't think of any reason why the
> schedd would hold the claim after a job completes without starting
> another job for an hour other than the schedd being very very busy.
> Perhaps it would be worth looking into what exactly is going on with
> that.  One place to start would be the shadow log.  Look at the shadow
> that ran the job that ran on the claim before it transitioned to
> Claimed/Idle for a long period of time.  Did the shadow exit cleanly?
> In the schedd log, can you see the schedd handling the exit of that
> shadow?  It should immediately launch another job on the claim at that
> point.

As far as I could see at the time, the schedd was only moderately loaded.
However, the same user had more jobs in the queue, all of which had memory
requirements beyond what that machine offers. Could that be a reason? I could
try to cross-check that claim as well, so for now this is just a guess.

Cheers

Carsten