Re: [HTCondor-users] Preemption doubts



Hi Greg, all,

After several weeks, I've discovered why preemption was not working on that machine...

The machine has one internal disk (with a /home partition of 500 GB) and other disks in a software RAID (a /local partition of 42 TB). If the EXECUTE directory is set on the internal disk (/home/execute), everything works as expected. However, if the EXECUTE directory is set to /local/execute, preemption does not work!

It is really strange, because the jobs run correctly when EXECUTE points to /local/execute, the temporary directories are created correctly, etc. The only issue is with preemption...

Anyway, using the /home partition instead of /local is not a big deal for us, since the /local directory is intended for other data, but I would like to know if there is an explanation... Do you have any idea what is happening?

Best regards,

Carles



On Wed, 9 Dec 2020 at 07:20, Carles Acosta <cacosta@xxxxxx> wrote:
Hi Greg,

Thank you very much. I'm going to try with another Rank expression without the AccountingGroup.

Cheers,

Carles

On Mon, 7 Dec 2020 at 17:51, Greg Thain <gthain@xxxxxxxxxxx> wrote:


Hi Carles:


This preemption configuration should work. One thing to be aware of is that Accounting Group quotas, if you have them, are enforced before startd RANK preemption. That is, if an accounting group is already at the quota limit, startd RANK will not allow the group to go over the quota.

To test this, can you try a test with the RANK expression using some other custom job attribute like "GoFirst"? Have a test preemption job set "+GoFirst = true", and set the RANK expression to "RANK = GoFirst" or something like that.
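
As a minimal sketch of that test, the two pieces might look like the following. The attribute name GoFirst and the RANK line are the ones suggested above, and MAXJOBRETIREMENTTIME = 60 is taken from the original configuration in this thread; the executable, arguments and request_cpus lines are only illustrative assumptions for a throwaway test job.

```
# --- Startd configuration on the test machine ---
# Prefer jobs that carry the custom GoFirst attribute.
RANK = GoFirst
# Give a preempted job up to 60 seconds to retire (value from the original config).
MAXJOBRETIREMENTTIME = 60

# --- Submit description file for the test preemption job ---
# executable/arguments are placeholders (assumption), any short test job will do.
executable   = /bin/sleep
arguments    = 600
request_cpus = 1
# Custom job attribute referenced by the startd RANK expression.
+GoFirst     = true
queue
```

The startd has to pick up the new RANK value (for example after a reconfig) before the test job is submitted.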


-greg





On Fri, 27 Nov 2020 at 13:25, Carles Acosta <cacosta@xxxxxx> wrote:
Hi all,

We are running HTCondor 8.8.11 and we are using "Priority preemption" effectively on our farm.

However, if I'm not mistaken, preemption can also work using the Startd Rank, without considering the priority of the jobs. Is that right? That should hold as long as NEGOTIATOR_CONSIDER_PREEMPTION is set to True (we also have ALLOW_PSLOT_PREEMPTION = True). I understand that the PREEMPTION_REQUIREMENTS that we use for our "Priority preemption" has no effect on "Startd Rank preemption".

For example, take one machine with 2 CPUs and this config:

RANK = regexp("test",AccountingGroup)
MAXJOBRETIREMENTTIME = 60

It is running 2 jobs, each requesting 1 CPU, from an AccountingGroup that does not contain "test". When I submit another job that requests 1 CPU from an AccountingGroup that contains "test", its Rank should be higher and it should then preempt one of the running jobs after just 1 minute, if this works as I've understood.

But this is not working: the job with the "test" AccountingGroup remains Idle while the other two, which should be preempted (with Rank false and CurrentRank 0), are still running. I've also checked that when the job with the "test" AccountingGroup is running, the Rank is true and CurrentRank is 1.0, so the rank definition seems to be fine.

I'm sure that I'm doing something incorrectly, and I am not sure whether preemption works as I expected. Can someone help me?

Thank you in advance.

Best regards,

Carles

--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
Avís - Aviso - Legal Notice: http://legal.ifae.es



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
_______________________________________________




--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://legal.ifae.es