Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] SYSTEM_PERIODIC_HOLD ignored

Date: Fri, 27 Aug 2021 12:16:32 -0500
From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] SYSTEM_PERIODIC_HOLD ignored

On 8/27/2021 2:42 AM, Stefano Dal Pra wrote:

Experts here might want to confirm: I think that some job ClassAd attributes (such as ResidentSetSize) are actually updated only every 15 minutes.
If that is true, it means that this policy could put a job on hold now based on a value measured up to 15 minutes earlier.

It is a bit complicated....

The condor_starter on the execute node sends updates to the condor_shadow every 5 minutes by default, carrying dynamic attributes about the job such as ResidentSetSize. How often the starter updates the shadow is controlled by the condor_starter config knobs STARTER_UPDATE_INTERVAL (the interval between updates) and STARTER_INITIAL_UPDATE_INTERVAL (how long until the first update is sent).

Upon receiving an update from the condor_starter, the condor_shadow for the job will evaluate job policy expressions like SYSTEM_PERIODIC_HOLD for running jobs.    Job policy expressions are evaluated/handled by the condor_shadow when a job is running to help offload work from the schedd.
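
As a rough illustration of the kind of expression the shadow would evaluate on each starter update, a submit-side configuration might look like the sketch below. The 4 GiB threshold and the hold-reason text are hypothetical values chosen for the example, not the policy discussed in this thread; ResidentSetSize is reported in KiB, and JobStatus == 2 means the job is running.

```
# Hypothetical policy sketch: hold running jobs whose resident set size
# exceeds 4 GiB. ResidentSetSize is in KiB, so 4 GiB = 4 * 1024 * 1024.
SYSTEM_PERIODIC_HOLD = (JobStatus == 2) && (ResidentSetSize > 4 * 1024 * 1024)

# Example text only; shown as the hold reason for jobs held by the policy above.
SYSTEM_PERIODIC_HOLD_REASON = "Job exceeded the 4 GiB resident memory limit"
```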

Then, periodically and at a lower frequency (every 15 minutes by default), the condor_shadow pushes those updated attributes to the schedd so they are visible via condor_q. A lower frequency is used here to avoid overloading the schedd when running thousands of jobs. How often the shadow pushes attributes to the schedd is controlled by the config knob SHADOW_QUEUE_UPDATE_INTERVAL.
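
For reference, the three knobs named above could be written out in the configuration roughly as follows. This is a sketch that restates the defaults described in this message (values are in seconds); the initial-update value is an assumption to check against the manual for your HTCondor version.

```
# Execute node: how often the condor_starter updates the condor_shadow.
# 300 seconds = the 5 minute default described above.
STARTER_UPDATE_INTERVAL = 300

# Execute node: delay before the starter sends its first update.
# Assumed default; verify in the manual for your version.
STARTER_INITIAL_UPDATE_INTERVAL = 2

# Submit node: how often the condor_shadow pushes attributes to the schedd.
# 900 seconds = the 15 minute default described above.
SHADOW_QUEUE_UPDATE_INTERVAL = 900
```

Lowering SHADOW_QUEUE_UPDATE_INTERVAL makes condor_q output fresher at the cost of more schedd load; it does not change how fresh the values seen by the policy expression are.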

So, even though you will only see changes to ResidentSetSize every 15 minutes via condor_q, the SYSTEM_PERIODIC_HOLD expression should be operating on values that are no more than 5 minutes old.

Hope this helps,
Todd

