
Re: [HTCondor-users] [HTCondor-Users] NEGOTIATOR_PRE_JOB_RANK is not working as expected



Thanks Todd.

With some modifications for our use case, it worked as expected.

On Tue, 13 Aug, 2019, 03:33 Todd Tannenbaum, <tannenba@xxxxxxxxxxx> wrote:
On 8/12/2019 12:58 PM, Vikrant Aggarwal wrote:
> Hello Experts,
>
> Sorry for the flurry of questions.
>
> We introduced some high memory nodes into an existing condor cluster. I have
> added "highmemory = true" to the machine ClassAd on those nodes.
>
> Without making any change to the submit file, I want to steer jobs that
> request more than a certain amount of memory to the high memory nodes. I
> prepared this expression for it:
>
> DEFAULT_RANK = (10000000 * My.Rank) + (1000000 * (RemoteOwner =?=
> UNDEFINED)) - (100000 * Cpus) - Memory
> NEGOTIATOR_PRE_JOB_RANK = ifthenelse(Target.requestmemory > 1024,
> 10000000 * (highmemory == true), $(default_rank))
>
> The first job, with request_memory = 1000, lands on a node without
> highmemory in its machine ClassAd.
> The second job, with request_memory = 1030, lands on a node with
> highmemory in its machine ClassAd.
> *The third job, with request_memory = 1000, lands on a node with
> highmemory in its machine ClassAd (not expected).*
>

You say all your high memory nodes have "highmemory = True" in the machine ad, but do all other nodes have "highmemory = False"? If not, this could be your problem, as NEGOTIATOR_PRE_JOB_RANK will evaluate to UNDEFINED in that case instead of the number you expected. You could change your clause "highmemory == true" to instead read
"highmemory =?= true". See
 https://htcondor.readthedocs.io/en/v8_9_2/misc-concepts/classad-mechanism.html#_expression_-examples
for an explanation of the =?= and =!= operators.
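
To make the difference concrete, here is a small Python toy model (not HTCondor code) of how the two operators treat a missing attribute. The helper names eq, meta_eq, times, rank_with_eq and rank_with_meta_eq are made up for illustration, and the model assumes that a Boolean multiplied by a number counts as 1 or 0, which the rank expressions in this thread rely on.

```python
# Toy model of ClassAd comparison semantics; an illustration, not HTCondor code.
UNDEFINED = object()  # stand-in for the ClassAd UNDEFINED value


def eq(a, b):
    """Model of '==': the result is UNDEFINED if either side is UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a == b


def meta_eq(a, b):
    """Model of '=?=': always True or False, even for UNDEFINED operands."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is b
    return type(a) is type(b) and a == b


def times(n, x):
    """Model of 'n * x': UNDEFINED propagates through arithmetic."""
    if x is UNDEFINED:
        return UNDEFINED
    return n * int(x)


def rank_with_eq(machine_ad):
    # 10000000 * (highmemory == true)
    return times(10000000, eq(machine_ad.get("highmemory", UNDEFINED), True))


def rank_with_meta_eq(machine_ad):
    # 10000000 * (highmemory =?= true)
    return times(10000000, meta_eq(machine_ad.get("highmemory", UNDEFINED), True))


highmem_node = {"highmemory": True}
plain_node = {}  # attribute absent, so a lookup yields UNDEFINED

print(rank_with_eq(plain_node) is UNDEFINED)  # True: the rank is not a number
print(rank_with_meta_eq(plain_node))          # 0
print(rank_with_meta_eq(highmem_node))        # 10000000
```

In this model, only the "=?=" form gives every node a numeric rank when some machine ads lack the attribute.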

But additionally, just steering your big memory jobs to your big memory slots is probably not all you want to do. I imagine
you probably also want to explicitly steer your small memory jobs away from your big memory slots.

Finally, your NEGOTIATOR_PRE_JOB_RANK ignores all the other goodness in the default expression if the job requests lots of memory, such as preferring a slot that is completely idle (no RemoteOwner) over a slot that is already busy serving someone who would need to be preempted. Not sure if you really intended that or not.

I'd suggest trying

NEGOTIATOR_PRE_JOB_RANK = $(NEGOTIATOR_PRE_JOB_RANK) + \
   1000000 * ((requestmemory > 1024 && highmemory =?= true) || (requestmemory <= 1024 && highmemory =!= true))

**Warning** I didn't test the above suggestion, I am just pontificating off the top of my head... hope
I am helping more than I am hurting :)
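
As a check on the logic only (it says nothing about how the negotiator actually behaves), the bonus term in the suggestion can be tabulated with a short Python sketch. The function name steering_bonus is hypothetical, highmemory=None stands for a machine ad that lacks the attribute, and the job is assumed to always define requestmemory.

```python
# Sketch of the 1000000 * (...) bonus term from the suggested expression.
# Assumption: requestmemory is always defined, so only highmemory can be missing.

def steering_bonus(request_memory, highmemory=None):
    """Return the bonus added to the rank for one job/slot pair.

    highmemory=None models a machine ad without the highmemory attribute.
    """
    is_high = highmemory is True       # highmemory =?= true
    not_high = highmemory is not True  # highmemory =!= true
    matched = (request_memory > 1024 and is_high) or \
              (request_memory <= 1024 and not_high)
    return 1000000 * int(matched)


for mem in (1000, 1030):
    for attr in (True, False, None):
        print(mem, attr, steering_bonus(mem, attr))
```

Under these assumptions, a job asking for more than 1024 gets the bonus only on slots with highmemory set to true, and a smaller job gets it only on slots where highmemory is false or absent.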

regards,
Todd





> If the second job instead has request_memory = 1000, then all jobs go to the
> same node without highmemory in its machine ClassAd.
>
> I don't think this is related to *consumption_policy*, because a
> negotiation cycle is happening for each job.
>
> Without modifying anything in the submit file, is there any other
> recommended method of steering high request_memory jobs to the highmem
> nodes, falling back to normal nodes if resources are not available on the
> highmem nodes?
>
> Thanks & Regards,
> Vikrant Aggarwal
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>


--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685