
Re: [HTCondor-users] Avoiding CPU wastage



Hello Michael,

Thanks for your response.

I saw your example in this thread [1]. I referred to it and to another thread [2] in the hope of making things work for me, but unfortunately I still have not managed to put together the required configuration. Yes, RemoteWallClockTime is only updated when the job changes state, for example from running to held, or on any other state transition. It is also a cumulative attribute.

If I use your example as is, the job goes into hold status with the proper message within a few seconds of submission, which is expected.

Also, SUBMIT_EXPRS always gives me this warning when I submit the job:

$ condor_submit sleep.sub
Submitting job(s).
1 job(s) submitted to cluster 2633.
WARNING: the line 'SUBMIT_EXPRS = TotalExecutingTime RemoteCpuUtilizationPercent RemoteUserCpuUtilizationPercent RemoteSysCpuUtilizationPercent' was unused by condor_submit. Is it a typo?
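
A possible reading of this warning, stated as an assumption rather than a confirmed diagnosis: SUBMIT_EXPRS is a configuration-file setting, so condor_submit treats it as an ordinary, unused macro when it appears in a submit description file. Within the submit file itself, a custom attribute can be added to the job ad with the '+' prefix. A minimal sketch, in which the attribute name CpuUtilizationPercent is made up for illustration and $(RemoteCpuUtilizationPercent) is the macro from the example further down:

# Hypothetical custom job attribute, added with '+' instead of SUBMIT_EXPRS.
# Place this line after the macro definitions it refers to.
+CpuUtilizationPercent = $(RemoteCpuUtilizationPercent)

If this works as assumed, the value could then be inspected with condor_q <job id> -af CpuUtilizationPercent.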

I added a 300 s time window to your example so that the job runs for at least 300 s before the condition is evaluated.

RunningTime = (CurrentTime - JobCurrentStartDate)
TotalExecutingTime = \
 ( ifThenElse(! isUndefined(RemoteWallClockTime), \
    RemoteWallClockTime, 0) - \
  ifThenElse(! isUndefined(CumulativeSuspensionTime), \
    CumulativeSuspensionTime, 0) \
 ) + \
 ( ifThenElse(JobStatus == 2 && $(RunningTime) > 300, \
    $(RunningTime), 0) \
 ) + \
 ( ifThenElse(JobStatus == 7, \
    LastSuspensionTime - JobCurrentStartDate, 0) \
 )
RemoteCpuUtilizationPercent = \
 ifThenElse(! isUndefined($(TotalExecutingTime)) && $(TotalExecutingTime) > 300, \
  ((RemoteSysCpu + RemoteUserCpu) / RequestCpus) / $(TotalExecutingTime) * 100, \
  UNDEFINED)
periodic_hold = ($(RemoteCpuUtilizationPercent) < 20)

But the job goes on hold because the condition evaluates to UNDEFINED. I was not expecting that: the condition should only have been evaluated once the job had been running for more than 300 s, but the job was put on hold after only 10 s. I guess this comes back to the same discussion in this thread, in which Collin Mehring mentioned the late evaluation of the expression.

$ condor_q 2632.0 -af holdreason
The job attribute PeriodicHold expression '( ifThenElse( !isUndefined(( ifThenElse( !isUndefined(RemoteWallClockTime),RemoteWallClockTime,0) - ifThenElse( !isUndefined(CumulativeSuspensionTime),CumulativeSuspensionTime,0) ) + ( ifThenElse(JobStatus == 2,CurrentTime - JobCurrentStartDate,0) ) + ( ifThenElse(JobStatus == 7,LastSuspensionTime - JobCurrentStartDate,0) )) && ( ifThenElse( !isUndefined(RemoteWallClockTime),RemoteWallClockTime,0) - ifThenElse( !isUndefined(CumulativeSuspensionTime),CumulativeSuspensionTime,0) ) + ( ifThenElse(JobStatus == 2,CurrentTime - JobCurrentStartDate,0) ) + ( ifThenElse(JobStatus == 7,LastSuspensionTime - JobCurrentStartDate,0) ) > 300,( ( RemoteSysCpu + RemoteUserCpu ) / RequestCpus ) / ( ifThenElse( !isUndefined(RemoteWallClockTime),RemoteWallClockTime,0) - ifThenElse( !isUndefined(CumulativeSuspensionTime),CumulativeSuspensionTime,0) ) + ( ifThenElse(JobStatus == 2,CurrentTime - JobCurrentStartDate,0) ) + ( ifThenElse(JobStatus == 7,LastSuspensionTime - JobCurrentStartDate,0) ) * 100,undefined) < 20 )' evaluated to UNDEFINED

$ condor_q 2636.0 -af RemoteWallClockTime CumulativeSuspensionTime JobCurrentStartDate LastSuspensionTime
10.0 0 1558087033 0
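
For what it is worth, the hold reason above suggests two things that could be tried. First, $(TotalExecutingTime) is substituted as text, so in the expanded expression the division applies only to the first parenthesised term and the "* 100" only to the last one. Second, a comparison against UNDEFINED is itself UNDEFINED, which is what the hold reason reports. The following is an untested sketch, not verified on 8.5.8, that wraps each macro reference in parentheses and uses the =?= operator so that the result is always TRUE or FALSE. The macros RunningTime and TotalExecutingTime are assumed to be defined exactly as in the example above:

# Parenthesise $(TotalExecutingTime) wherever it is used, because macro
# expansion is plain text substitution.
RemoteCpuUtilizationPercent = \
 ifThenElse(($(TotalExecutingTime)) > 300, \
  ((RemoteSysCpu + RemoteUserCpu) / RequestCpus) / ($(TotalExecutingTime)) * 100, \
  UNDEFINED)
# "=?= True" turns an UNDEFINED comparison into FALSE instead of UNDEFINED.
periodic_hold = (($(RemoteCpuUtilizationPercent)) < 20) =?= True

With this form the hold expression should be FALSE, rather than UNDEFINED, until the 300 s window has passed.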

Still, I am not sure why the example I shared in my previous message is not working as expected, and why the job keeps on running beyond its sleep time until PERIODIC_RELEASE evaluates to FALSE.

Sorry, I have nothing to share at the moment about the reason for using this specific version.


Thanks & Regards,
Vikrant Aggarwal


On Thu, May 16, 2019 at 10:47 PM Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx> wrote:
I'm curious as to why you're using a development release, rather than a stable release. Is there a feature you need in 8.5?

Michael V. Pelletier
Information Technology
Digital Transformation & Innovation
Integrated Defense Systems
Raytheon Company

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Vikrant Aggarwal
Sent: Wednesday, May 15, 2019 3:22 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] Re: [HTCondor-users] Avoiding CPU wastage

Team,

Any inputs to make this work on 8.5.8?



