
Re: [HTCondor-users] hook_prepare_job JobAd not updating

The HOOK_PREPARE_JOB is invoked after the job has been matched to a slot (and after the dynamic slot has been created, if there is a p-slot). Because of this, it cannot change attributes that influence the resource allocation of the slot.

It is intended to allow changes that affect the execution of the job, such as adding and removing environment variables.
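
To illustrate the kind of change the hook is meant for, here is a minimal Python sketch of a prepare-job hook that adds one environment variable. It assumes the job ad arrives on stdin as plain `Attr = value` lines and that the job uses the new-style `Environment` attribute with simple values (no spaces or quotes). The variable name `SCRATCH_KIND` and the helper names are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a HOOK_PREPARE_JOB script that adds one environment variable."""
import sys


def parse_ad(text):
    """Parse 'Attr = value' lines into a dict of raw, unevaluated value strings."""
    ad = {}
    for line in text.splitlines():
        # Split on the first '=' only, so expressions containing '==' survive.
        name, sep, value = line.partition("=")
        if sep:
            ad[name.strip()] = value.strip()
    return ad


def add_env(ad, name, value):
    """Return a job ad line whose Environment also contains name=value.

    Assumption: new-style, space-separated Environment with simple values.
    """
    current = ad.get("Environment", '""')
    # Drop the surrounding double quotes of the ClassAd string, if present.
    if len(current) >= 2 and current[0] == current[-1] == '"':
        current = current[1:-1]
    merged = (current + " " if current else "") + "{0}={1}".format(name, value)
    return 'Environment = "{0}"'.format(merged)


if __name__ == "__main__":
    job_ad = parse_ad(sys.stdin.read())
    # SCRATCH_KIND is a hypothetical variable; whatever is printed to stdout
    # is inserted or updated in the job ad for this execution only.
    print(add_env(job_ad, "SCRATCH_KIND", "shm"))
```

A change like this is only applied to the copy of the job ad used for this execution on the node.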

You should not expect changes made by a HOOK_PREPARE_JOB script to be visible with condor_q, since a change that propagated back to the Schedd would affect future executions of the job on other resources.

If you want to see the changes made to the job for that single execution, the place to look is the .job.ad file in the job's execution sandbox.
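
As a rough sketch of how to check this from inside a running job (or from the node, given the sandbox path), the following Python reads the .job.ad file and prints a few attributes. It assumes the file holds plain `Attr = value` lines and that `_CONDOR_JOB_AD` is set in the job's environment to the file's path. The helper names are hypothetical, and the values are printed as raw strings, not evaluated ClassAd expressions.

```python
#!/usr/bin/env python3
"""Sketch: print selected attributes from the .job.ad file in a job sandbox."""
import os
import sys


def find_job_ad(scratch_dir=None):
    """Return the path to .job.ad, from an explicit sandbox dir or the environment."""
    if scratch_dir is not None:
        return os.path.join(scratch_dir, ".job.ad")
    # Assumption: inside a running job, _CONDOR_JOB_AD points at the file.
    return os.environ.get("_CONDOR_JOB_AD", ".job.ad")


def read_job_ad(path):
    """Parse 'Attr = value' lines into a dict of raw value strings."""
    ad = {}
    with open(path) as handle:
        for line in handle:
            name, sep, value = line.partition("=")
            if sep:
                ad[name.strip()] = value.strip()
    return ad


if __name__ == "__main__":
    path = find_job_ad(sys.argv[1] if len(sys.argv) > 1 else None)
    if os.path.isfile(path):
        ad = read_job_ad(path)
        for attr in ("RequestMemory", "RequestDisk", "Environment"):
            print(attr, "=", ad.get(attr, "<undefined>"))
    else:
        print("no job ad found at", path)
```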

If you need to make changes to the job that affect the resource request that the job makes, you need to do that before the job is matched to a slot. A common way to do this is to use JOB_TRANSFORM_NAMES in the Schedd configuration to modify the job as it is being submitted.


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Kevin Hrpcek via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, December 29, 2021 2:45 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Kevin Hrpcek <kevin.hrpcek@xxxxxxxxxxxxx>
Subject: [HTCondor-users] hook_prepare_job JobAd not updating
Hey all,

I've been working on a HOOK_PREPARE_JOB script that will edit the job's RequestDisk and RequestMemory as it lands on a node. The general idea is to prolong node SSD life by having the node pick whether to run the job on the SSD or in /dev/shm, based on the job's disk and memory requirements. If the job is small enough it goes to shm, while rewriting RequestMem, and if it is big enough it goes to the SSD. This will allow job submissions to use a single method of setting RequestDisk and RequestMemory and not worry about whether the job lands on SSD or shm.

The problem I'm having is that the Python script is running and is able to do what is necessary to shift where the job's data is downloaded and run, but it isn't modifying the JobAds. From what I understand from https://htcondor.readthedocs.io/en/latest/misc-concepts/hooks.html, it is as easy as printing a key and value to stdout. So in Python I'm doing a `print('RequestMemory = {0}'.format(newmem))`, where the newmem variable is an integer for the value I'm trying to update it to. When I look at the job with `condor_q -long` after it is running and the hook has executed, the JobAds never update. The memory request is also unchanged when I look at `condor_status node`.

Anyone have a thought on what I'm doing wrong here? I've tried different combinations of writing JobAds: print, sys.stdout.write, quoting differently, and trying to add a JobAd instead of modifying one.

Expected standard output from the hook

A set of attributes to insert or update into the job ad. For example, changing the Cmd attribute to a quoted string changes the executable to be run.

OS: CentOS 7
Condor version on submitter and startd: condor-8.8.15-1.el7.x86_6

Kevin Hrpcek
Space Science & Engineering Center
University of Wisconsin-Madison