
Re: [HTCondor-users] parallel universe fails with PREEMPTION_REQUIREMENTS == False



Hi,

Hmm, no idea. I found one on the net; I think it's this one:

https://github.com/htcondor/htcondor/blob/master/build/packaging/srpm/condor_config.local.dedicated.resource

[root@bird621 /etc/condor/config.d]# grep -v \# 100dedicated_ressource_wn.conf 
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx"
SUSPEND	= False
CONTINUE	= True
PREEMPT	= False
KILL		= False
WANT_SUSPEND	= False
WANT_VACATE	= False
RANK		= Scheduler =?= $(DedicatedScheduler)
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler
START = (NODE_IS_HEALTHY =?= True) && (StartJobs =?= True)
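
To check such a file programmatically, the listing above can be read into a dictionary. This is only a minimal Python sketch: the function name `parse_condor_config` and the sample string are assumptions for illustration, and it neither expands macros such as `$(LIBEXEC)` nor handles line continuations.

```python
def parse_condor_config(text):
    """Parse 'NAME = value' lines into a dict, skipping lines that contain '#'."""
    settings = {}
    for line in text.splitlines():
        # Same filter as the grep above: drop any line containing '#',
        # and also drop blank lines.
        if "#" in line or not line.strip():
            continue
        # Split on the first '=' only, so '=?=' inside the value is preserved.
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a NAME = value line
        settings[name.strip()] = value.strip()
    return settings


# Hypothetical sample mirroring a few lines of the listing above.
SAMPLE = """\
# comment line, filtered out
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx"
PREEMPT\t= False
RANK\t\t= Scheduler =?= $(DedicatedScheduler)
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler
"""

config = parse_condor_config(SAMPLE)
print(config["RANK"])
```

Splitting on the first `=` matters here because several values (`RANK`, `START`) contain the ClassAd operator `=?=`.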

Best
Christoph


-- 
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx

----- Original Message -----
From: "Jason Patton" <jpatton@xxxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Monday, 5 March 2018 15:11:32
Subject: Re: [HTCondor-users] parallel universe fails with PREEMPTION_REQUIREMENTS == False

Christoph,

Are you using one of the pre-defined cases from the
condor_config.local.dedicated.resource example config? If so, which
one?

Jason Patton

On Fri, Mar 2, 2018 at 8:37 AM, Beyer, Christoph
<christoph.beyer@xxxxxxx> wrote:
> Hi everybody,
>
> I guess I need a hint :(
>
> I am trying to run the parallel environment following the example in the documentation, and everything looks quite OK to me: the schedd knows about the dedicated resources and gets together the slots it needs.
>
> The negotiator, on the other hand, is not happy and rejects the parallel job:
>
> Rejected 855232.0 DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx <131.169.56.32:9618?addrs=131.169.56.32-9618&noUDP&sock=2854700_55af_3>: PREEMPTION_REQUIREMENTS == False
>
> I used the example config file for the parallel universe on the worker nodes and do not see any other obvious errors or problems, so I think the overall setup is OK. Maybe someone can point me in the right direction as to what the reject message means?
>
> Best
> Christoph
>
> --
> Christoph Beyer
> DESY Hamburg
> IT-Department
>
> Notkestr. 85
> Building 02b, Room 009
> 22607 Hamburg
>
> phone:+49-(0)40-8998-2317
> mail: christoph.beyer@xxxxxxx
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/