Re: [HTCondor-users] Change the HoldReason for a PeriodicHold expression



Hi Carles,

If your PeriodicHold expression is not evaluating the way you expect,
the best first step is to use the debug() function to get additional
information about it. To do this, just wrap your expression in the
function, for example:

set_HoldReason = debug(IfThenElse(ResidentSetSize >
JobMemoryLimit*1024,HoldMemoryReason,undefined))

In order for the debug output to appear, you'll also need to set your
debug level to D_FULLDEBUG for the daemons involved. I'm pretty sure
the ones you're looking for are:

SCHEDD_DEBUG = D_FULLDEBUG
SHADOW_DEBUG = D_FULLDEBUG

At this point, you should be able to go into your log folder and run a
`grep -rn "Classad debug:"` to see a bunch of debug output. This
should hopefully reveal why the expressions are not doing what you
want.
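
If you would rather collect the matching lines from a script than from
grep, here is a minimal Python sketch of the same search. The function
name find_debug_lines is made up for this example, and the sketch
assumes only that the debug lines contain the literal text
"Classad debug:" as in the grep command above; it makes no assumption
about the rest of the log line format.

```python
import os

# Marker text taken from the grep command above.
MARKER = "Classad debug:"


def find_debug_lines(log_dir, marker=MARKER):
    """Yield (path, line_number, text) for each line containing marker.

    Hypothetical helper for illustration; it walks log_dir recursively,
    like `grep -rn`, and skips files it cannot read.
    """
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                # errors="replace" keeps odd bytes in a log from aborting the scan.
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if marker in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it, as grep would warn and move on
```

To use it, pass your log folder, for example
`for hit in find_debug_lines("/path/to/your/log/folder"): print(*hit)`.
The path is a placeholder; `condor_config_val LOG` should print the
actual location on your machine.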

Lastly, there are a couple of minor syntax errors in the expressions
you sent us, but I assume that's just because you transcribed them
loosely into this email. If the above suggestions don't help, please
send the classad expressions exactly as they appear in your schedd ad
so we can see if there's anything obviously wrong.

Mark

On Sat, Oct 17, 2020 at 10:51 AM Carles Acosta <cacosta@xxxxxx> wrote:
>
> Dear all,
>
> We are running HTCondor 8.8.10 and we have a PeriodicHold expression added to all the jobs by JobTransform that is something like this:
>
> set_HoldMemory = (JobStatus == 2 && ifThenElse(ResidentSetSize isnt undefined, ResidentSetSize > JobMemoryLimit*1024, false));
> set_PeriodicHold = HoldMemory;
>
> I want to create a HoldReason accompanying this PeriodicHoldExpression:
>
> set_HoldMemoryReason = strcat("Your job memory ", ResidentSetSize/1024, "MB, exceeded the Job Memory Limit", JobMemoryLimit, "MB");
> set_HoldReason =fThenElse(ResidentSetSize > JobMemoryLimit*1024,HoldMemoryReason,undefined));
>
> However, this is not working... any time a job is correctly held due to the PeriodicHold expression is true the HoldReason is:
>
> The job attribute PeriodicHold expression 'HoldMemory' evaluated to TRUE
>
> Am I missing something? Is there any way to change the Hold Reason for a Periodic Hold expression?
>
> Thank you in advance.
>
> Best regards,
>
> Carles
> --
> Carles Acosta i Silva
> PIC (Port d'Informació Científica)
> Campus UAB, Edifici D
> E-08193 Bellaterra, Barcelona
> Tel: +34 93 581 33 08
> Fax: +34 93 581 41 10
> http://www.pic.es
> Avís - Aviso - Legal Notice:  http://legal.ifae.es



-- 
Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison