
[HTCondor-users] Exact semantics of Universe = Local? How to have all work done on submit machine?



I have a dozen or so Windows machines in a Condor cluster. I am trying to take one machine in the cluster and have jobs submitted from it run only on that machine. I figured I could simply change the universe from "vanilla" to "local" in the submit files, and the locally submitted jobs would then queue up and run locally.
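For reference, here is a stripped-down version of what I'm doing in the submit file (the executable, input, and log names are just placeholders for this example):

    # was: universe = vanilla
    universe     = local
    executable   = my_job.bat
    arguments    = input.dat
    log          = my_job.log
    output       = my_job.out
    error        = my_job.err
    run_as_owner = TRUE
    queue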

Firstly, is my understanding correct that the Local universe implies the same queuing and robustness as the Vanilla universe? Or does it do something more basic, like simply launching all the processes at once regardless of slots?

Secondly, and more importantly, nothing happens at all with my Local universe jobs (submitted as part of a DAG). The first job in the DAG simply sits in the queue and "has not been considered by the match maker." It has run_as_owner = TRUE, if that matters, and the Condor version is a bit old at 7.6.3.
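The DAG itself is nothing unusual; it looks roughly like this (node and submit-file names are placeholders):

    JOB A step_a.sub
    JOB B step_b.sub
    PARENT A CHILD B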

Next I tried going back to the Vanilla universe but adding a requirement that Machine == "the_name_of_the_local_machine". This also resulted in the job just sitting there unmatched. condor_q -analyze reports that the local machine's slots have rejected the job for their own reasons, which makes no sense, because the machine has been running these same jobs as part of the cluster just fine. How do I go about getting a better explanation of why the slots are rejecting the job? Nothing in the log files jumped out as an explanation.
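In case it helps, the relevant lines of that submit file look roughly like this (the machine name here is a placeholder, not the real hostname):

    universe     = vanilla
    requirements = (Machine == "the_name_of_the_local_machine")
    run_as_owner = TRUE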

I'm about to simply reinstall Condor with a local master on that machine and remove it from the cluster, but I feel like this should be unnecessary. Either of the two approaches above should work, right?