
Re: [HTCondor-users] Jobs massively killed with PeriodicRemove



Hi,

If I am not wrong, condor_drain does not kill the jobs. It waits gracefully until the jobs' MaxVacateTime/MaxJobRetirementTime is reached and then puts them back in the queue if they have not finished. These requeued jobs have already started once, so they have NumJobStarts > 0, and your periodic_remove expression removes all of them. Do you really need this periodic_remove expression?

Then, my guess is that if you set a MaxJobRetirementTime that is long enough during the drain, the jobs will have enough time to finish and will not be returned to the queue.
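
As a rough sketch of that idea, assuming the worker nodes read a local condor_config and that 48 hours covers your longest jobs (both values are made up for illustration and are not taken from this cluster), the startd side could look like this:

```
# Worker-node (startd) configuration, illustrative values only.

# Assumption: 48 hours is longer than the longest job you expect to run.
# A running job may keep its slot for this many seconds after a graceful
# drain starts, before the startd begins to vacate it.
MAXJOBRETIREMENTTIME = 48 * 3600

# Seconds the startd waits after the soft kill before it hard-kills a job
# whose retirement time has run out.
MACHINE_MAX_VACATE_TIME = 600
```

Jobs that finish within the retirement time never go back to the idle state, so a periodic_remove expression that tests NumJobStarts > 0 on idle jobs would not match them.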

Cheers,

Carles

On Thu, 24 Mar 2022 at 11:50, Edith Knoops <knoops@xxxxxxxxxxxxx> wrote:
On 3/24/22 10:23, Beyer, Christoph wrote:
> Hi,

Thanks for your answer.


>
> where did you define the system periodic remove expression? It actually says to remove jobs that are idle and have not started yet, which is pretty much the definition of an idle job ;)

In submit-condor-job on the ARC, but it has not changed since the time
everything worked without defrag, and I did not modify it.

I tried commenting out all the periodic remove settings and forcing the
expression to false, but then nothing ran.

With the current configuration a lot of jobs are killed, but the cluster
is more or less full of running jobs.

And for sure I do not want to kill all idle jobs; queues are useful :)


>
> This might make sense if you want to lower the idle job queue to 'near to 0' and only accept jobs that more or less start in the same second - still a weird approach this would be :)

That is not what I want


>
> Best
> christoph
>

--
--------------------------------------------------------------
Edith Knoops
CPPM/CNRS                      Mail: knoops@xxxxxxxxxxxxx
163 Av de Luminy case 902      Tel : (+33) (0)4 91 82 72 02
13288 Marseille Cedex 9 France
--------------------------------------------------------------

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://legal.ifae.es