Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [Condor-users] fetchwork vs. claim_worklife

Date: Tue, 12 Apr 2011 13:15:16 -0500
From: Dan Bradley <dan@xxxxxxxxxxxx>
Subject: Re: [Condor-users] fetchwork vs. claim_worklife



On 4/12/11 12:46 PM, Carsten Aulbert wrote:

Hi Dan

On Tuesday 12 April 2011 16:58:52 Dan Bradley wrote:

I am puzzled about why preemption is ineffective in the case where the
work-fetch job has higher rank than the existing claim.  What version of
condor is this?

Version 7.4.4
But I was not aware that preemption is needed to claim an idle slot.

The logs you posted showed the slot transitioning to Claimed/Idle, not Unclaimed/Idle. Therefore, the work-fetch job must preempt the claim of the schedd that is holding it. I can't think of any reason why the schedd would hold the claim for an hour after a job completes without starting another job, other than the schedd being very, very busy. Perhaps it would be worth looking into what exactly is going on with that. One place to start would be the shadow log. Look at the shadow that ran the last job on the claim before it transitioned to Claimed/Idle for a long period of time. Did the shadow exit cleanly? In the schedd log, can you see the schedd handling the exit of that shadow? It should immediately launch another job on the claim at that point.
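To find which claims to chase in the shadow and schedd logs, it may help to list the periods a slot spent in Claimed/Idle. The following is only a rough sketch: it assumes the state changes have already been pulled out of the startd log by hand or by some other script into time-sorted (timestamp, "State/Activity") pairs. The function name, the input layout, and the five-minute default threshold are all made up for illustration and are not part of Condor.

```python
from datetime import datetime, timedelta


def long_claimed_idle_spans(events, threshold=timedelta(minutes=5)):
    """Return (start, end) pairs where the slot sat in Claimed/Idle.

    events: time-sorted list of (datetime, "State/Activity") tuples.
    This layout is an assumption; extracting it from the real logs
    is left out because the log format is not shown in this thread.
    Only spans at least `threshold` long are reported. A span still
    open at the end of the input is not reported.
    """
    spans = []
    start = None  # when the current Claimed/Idle period began, if any
    for when, state in events:
        if state == "Claimed/Idle":
            if start is None:
                start = when
        elif start is not None:
            # The slot left Claimed/Idle: close the current period.
            if when - start >= threshold:
                spans.append((start, when))
            start = None
    return spans
```

Each reported start time is roughly when the previous job finished on that claim, which is where to look for the shadow exit and for the schedd's handling of it.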

I am also curious why claims are sitting in Claimed/Idle for so long
after a job finishes.  Is the schedd severely overloaded?

Not really - as far as I can tell, busy as usual with < ~50% CPU time on a
single node

The schedd is single-threaded. It is possible for the CPU to be not very busy but for the schedd to be having performance problems due to disk I/O or blocking network communications. Is the schedd responsive to condor_q queries?
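One rough way to put a number on "responsive" is to time how long a query takes to return. The sketch below is only an illustration: time_command is a made-up helper, not a Condor tool, and the 120-second timeout is an arbitrary assumption. It measures wall-clock time, which is what matters here, since a schedd blocked on disk or network would show low CPU usage but slow replies.

```python
import subprocess
import time


def time_command(argv, timeout=120):
    """Run argv and return (elapsed_seconds, exit_code).

    exit_code is None if the command did not finish within `timeout`
    seconds. Output is discarded; only the wall-clock time matters.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


# Example use on the submit machine (requires condor_q in PATH):
#   elapsed, code = time_command(["condor_q"])
#   print("condor_q took %.1f s, exit code %s" % (elapsed, code))
```

Running this a few times while claims are sitting in Claimed/Idle, and again when things look normal, would give a basis for comparison.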


--Dan

