Re: [Condor-users] DedicatedScheduler hogging resources
- Date: Thu, 09 Mar 2006 14:59:19 -0600
- From: Greg Thain <gthain@xxxxxxxxxxx>
- Subject: Re: [Condor-users] DedicatedScheduler hogging resources
Rok Roskar wrote:
> I'm running MPI under Condor 6.6. The DedicatedScheduler holds on to
> resources even after all MPI jobs have been removed from the queue. Is
> there any way to fix this? Or is it an unfortunate byproduct of mixing
> parallel and serial jobs on the same set of resources?
The dedicated scheduler holds on to claims for UNUSED_CLAIM_TIMEOUT
seconds after the job leaves the queue, where UNUSED_CLAIM_TIMEOUT is a
parameter in the condor_config file. The default is 300 seconds, and you
can lower this as you like.
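As a rough illustration of the setting described above, a condor_config entry might look like the following. The value 60 is only an example chosen here, not a recommendation from this thread, and the placement (the configuration of the machine running the dedicated scheduler) is an assumption.

```
# Example value only: release unused dedicated claims after 60 seconds
# instead of waiting for the default timeout.
UNUSED_CLAIM_TIMEOUT = 60
```

After editing the file, the running daemons typically need to re-read their configuration (for example with condor_reconfig) before the new value takes effect.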
> Also, my jobs sometimes try to start even when the DedicatedScheduler
> doesn't have enough resources for them. This causes an endless loop of
> unsuccessful execution attempts, so all the resource time gets wasted.
> For example, my job requests 8 machines, but only 7 are available.
> Somehow, Condor tries to execute the job anyway, but because there
> aren't enough resources, it doesn't run. Solutions?
Does the job try to start, or do the machines just get claimed?