
Re: [HTCondor-users] Weird behavior with User Priority and preemption



Hi Gaëtan,

I suspect the problem here is that you've set
PREEMPTION_REQUIREMENTS to the constant True, instead of an expression
that evaluates to True only when preemption is actually warranted. Try
setting the following in your configuration instead:

PREEMPTION_REQUIREMENTS = (RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2)
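To make the comparison in that expression concrete, here is a minimal
plain-Python sketch of the same arithmetic. It is only an illustration,
not how HTCondor evaluates ClassAds. The function name and the sample
numbers are made up for this example, and it assumes that
RemoteUserPrio is the priority of the user whose job is currently
running on the slot, that SubmitterUserPrio is the priority of the user
whose idle job is being considered, and that a lower value means a
better priority.

```python
def preemption_allowed(remote_user_prio: float,
                       submitter_user_prio: float,
                       factor: float = 1.2) -> bool:
    """Model of: RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2

    remote_user_prio    -- priority value of the user running on the slot
    submitter_user_prio -- priority value of the user with the idle job
    Lower values are better, so the running job is only a candidate for
    preemption when its owner's value exceeds the idle job owner's value
    by more than the given factor.
    """
    return remote_user_prio > submitter_user_prio * factor


# Illustrative values only (hypothetical, not taken from a real pool).
print(preemption_allowed(1_000_000.0, 1.0))  # True: the running user is far worse
print(preemption_allowed(1.0, 1_000_000.0))  # False: the running user is far better
print(preemption_allowed(10.0, 10.0))        # False: equal priorities, no preemption
```

With a constant True the second and third cases would also pass this
particular check, whereas the 1.2 margin keeps users with similar
priorities from qualifying to preempt each other.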

If that doesn't work, you could also look at the NegotiatorLog file
in your log directory to see if there are any hints in there.
Feel free to post it here if you see anything that looks suspicious.
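
If the log is long, a small filter can narrow it down before posting.
The sketch below is only a rough helper: it assumes that the relevant
lines contain the substring "preempt" in some capitalization, which may
not catch everything of interest, and the function name is made up for
this example. The path of the log can be looked up with
condor_config_val NEGOTIATOR_LOG and passed as the first argument.

```python
import re
import sys
from typing import Iterable, List


def lines_mentioning_preemption(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain 'preempt', ignoring case.

    Assumption: preemption-related messages mention the word in some
    form; this is a plain substring filter, not a log parser.
    """
    pattern = re.compile(r"preempt", re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # The first argument is the path to the NegotiatorLog file.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for match in lines_mentioning_preemption(log_file):
            print(match)
```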

Mark


On Wed, Sep 22, 2021 at 11:03 AM Gaetan Geffroy <gage@xxxxxxxxx> wrote:
>
> Hi,
>
> I am new to Condor and currently playing around with MiniCondor using the htcondor/mini image from Docker Hub.
>
> I want to do some experiments with the preemption mechanism. I quote the documentation:
>
> When considering user priorities, the negotiator will not preempt a job running on a given machine unless the PREEMPTION_REQUIREMENTS expression evaluates to True and the owner of the idle job has a better priority than the owner of the running job.
>
> So the first thing I did was go into the configuration file, set PREEMPTION_REQUIREMENTS to True and run condor_reconfig. Then I created a new user called priouser and gave it a Real User Priority of 1.0 using condor_userprio -setprio. I then changed the priority of the existing submituser user to 1,000,000.
>
> My MiniCondor image can only run 4 jobs at a time, so I submit 4 jobs with submituser (simply sleeping for 60s and exiting), then I submit another one with priouser.
>
> The expected behavior is for Condor to preempt one of the jobs from submituser and run the one from priouser first. And at first, that is what happens: the first job from submituser goes back to IDLE and the one from priouser starts running. But after 5 seconds the job from priouser goes to IDLE and the one that was stopped restarts, runs for 5 seconds, and is then stopped again to let the other run, and so on until there are enough available slots for both to run at the same time. While there are no available slots, these two jobs keep killing each other every 5 seconds.
>
> However, when I submit the job with priouser first and then the 4 jobs with submituser, the last job from submituser waits for the priouser job to finish before it starts running.
>
> What is causing the prio job to stop 5 seconds after preempting a lower priority job?
>
> I also posted this question on StackOverflow with screenshots: https://stackoverflow.com/questions/69190475/htcondor-preempting-jobs-of-higher-priority-user-to-run-lower-priority-user-jobs
>
> Thanks,
>
> Gaëtan Geffroy



--
Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison