Re: [Condor-users] DedicatedScheduler hogging resources



>
> It will hold onto claims for UNUSED_CLAIM_TIMEOUT seconds after the job
> leaves the queue, where UNUSED_CLAIM_TIMEOUT is a parameter in the
> condor_config file.  The default is 300 seconds, and you can lower this
> as you like.
>

Yes, but it was still holding on to those resources well after the unused
claim timeout had expired.
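
For reference, the setting under discussion would look roughly like the
fragment below in condor_config. This is only a sketch: the value 60 is an
arbitrary example, not a recommendation and not necessarily what any
particular pool uses.

```
# Seconds the dedicated scheduler keeps a claimed resource
# when it has no job to run on it (example value only).
UNUSED_CLAIM_TIMEOUT = 60
```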

>> Also, my jobs sometimes try to start even when DedicatedScheduler doesn't
>> have enough resources for them. This causes infinite looping of
>> unsuccessful job execution, meaning that all the resource time gets
>> wasted. For example, my job requests 8 machines, but only 7 are available.
>> Somehow, Condor tries to execute the job anyway, but because there aren't
>> enough resources, it doesn't run. Solutions?
>
> Does the job try to start, or do the machines just get claimed?

The job actually attempts to start. Ownership of the resources passes from
DedicatedScheduler to the user, and the log file indicates that the job
started running. The job then fails to start, so the resources are given
back to DedicatedScheduler, which hands them to the user again. The job
fails to start again, and the cycle repeats.

Rok