
Re: [HTCondor-users] does condor_off -peaceful -daemon startd node; works for vanilla jobs?



Hi,

I can now answer some of my own questions:
condor_off -peaceful -daemon startd node
works for vanilla jobs, provided MaxJobRetirementTime is not set to a low
value or 0 (thanks to Todd for explaining this).

For a job that is already running, condor_qedit cannot change the
MaxJobRetirementTime=0 that was set by niceuser=true.
condor_qedit can only override this value before the job starts,
so it is not an option for very long-running jobs (days or weeks).

It is also not possible to stop niceuser=true from setting
MaxJobRetirementTime=0 by changing condor_config.local,
whether on the scheduler, the startd node, or the condor host.

But at least a user can set

MaxJobRetirementTime=999999999
in the submit description file, and this overrides the MaxJobRetirementTime=0
set by niceuser=true. The order of the two settings doesn't matter.
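
As an illustration, a minimal submit description sketch might look like the
following. It assumes the submit command spellings nice_user and
max_job_retirement_time as documented for condor_submit (the spellings used
above may differ from these), and the executable name long_job.sh is purely
hypothetical:

    # Sketch only, not a tested configuration.
    universe                = vanilla
    # Hypothetical executable name, for illustration.
    executable              = long_job.sh
    nice_user               = true
    # Overrides the retirement time of 0 implied by nice_user.
    max_job_retirement_time = 999999999
    queue

With a large retirement time like this, the job should be allowed to finish
when the startd is turned off peacefully, as long as the machine policy does
not impose a shorter retirement time.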

Did I miss something?

Best
Harald


On Friday 19 August 2016 15:05:42 Harald van Pee wrote:
> Hi Todd,
> 
> many thanks for your help.
> Now I am starting to understand: it's the niceuser=true statement!
> 
> Unfortunately this is our default, because our standard usage is
> several users, each with thousands of not-too-long-running (<2h) jobs,
> in a cluster with around 200 cores. They all start with
> niceuser=true, and this has the desired effect that there are always
> several cores free for time-critical jobs (or impatient users), which can
> just start as a normal user and almost always get enough jobs running.
> 
> Unfortunately I have not found how to set MaxJobRetirementTime
> for a job with niceuser=true. Have I overlooked something?
> Or does this mean that every user has to set it in the description file?
> Or can I set it with condor_qedit?
> Any suggestions?
> 
> best regards
> 
> Harald
> 
<snip>