
Re: [HTCondor-users] Parallel environment



Thanks, Michael,
I took your advice and added these lines to every /etc/condor/config.d/00debconf file:

DedicatedScheduler = "DedicatedScheduler@Master"
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

The problem is that when I try to run my jobs, they remain idle. The log file contains only:

Job submitted from host 192.168.56.101

Only the first job is submitted; the rest are not. I don't understand where the problem is.

Thanks in advance

2016-06-02 20:14 GMT+02:00 Michael V Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx>:
From: Francesca Maccarone <dike991@xxxxxxxxx>
Date: 06/02/2016 11:07 AM
> The problem is that all jobs in the queue remain idle and are never
> executed. I want to run my job in parallel: what changes should I
> make to get the desired behavior?


Ciao, Francesca,

Take a look at section 3.12.8 of the 8.4.6 manual. In order for a machine
to match a parallel universe job, it must be advertising the
"DedicatedScheduler" attribute which is set in the configuration and
pushed to the machine ad using the STARTD_ATTRS config.
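
As a rough sketch of what that looks like in a local config file on each
execute machine, assuming the submit machine's full host name is
submit.example.org (a made-up name; substitute the host where your
condor_schedd runs):

```
# Hypothetical host name: use the full host name of your own submit machine.
DedicatedScheduler = "DedicatedScheduler@submit.example.org"
# Publish the attribute in the machine ad so the dedicated scheduler can find the machine.
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler
```

The part after the @ normally has to match the submit machine's host name
as HTCondor knows it, so that value is worth checking if jobs stay idle.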

Once this is set up correctly, you should be good to go. The idea here
is that parallel jobs cannot tolerate having any one of the
parallel processes on any of the machines being terminated unexpectedly,
so machines set up in this way are presumed to prevent eviction and
thus be safe for parallel universe submissions.
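
For illustration, a policy along the lines of the manual's example for a
machine that runs only dedicated jobs might look like the sketch below. It
assumes DedicatedScheduler is already defined in the same configuration;
treat it as a starting point, not a drop-in policy for your pool:

```
# Only start jobs that come from the dedicated scheduler.
START        = Scheduler =?= $(DedicatedScheduler)
# Never suspend, preempt, or kill a running job.
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False
# Prefer jobs from the dedicated scheduler over any others.
RANK         = Scheduler =?= $(DedicatedScheduler)
```

If the machines should also accept ordinary jobs, setting START = True and
keeping the RANK line is one common variation.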

    -Michael Pelletier.
