[Condor-users] 2 match but reject the job for unknown reasons
- Date: Wed, 9 Dec 2009 20:38:38 -0600 (CST)
- From: Stephen Pietrowicz <srp@xxxxxxxxxxxxx>
- Subject: [Condor-users] 2 match but reject the job for unknown reasons
Hi,
I've run into a problem that I'm trying to debug, but I haven't found any clue as to what might be going wrong.
I've set up the Condor binaries on my own cluster and submitted a glide-in request to another system. This works: the nodes show up on my local cluster. I can then send vanilla universe Condor jobs to them, and they execute. I can also send simple (one-job) DAGs, and the job executes as well.
What I haven't been able to get working is the same setup under the parallel universe. I've simplified this to the "sleep" example (with "mydomain.org" standing in for my cluster's site):
universe = parallel
executable = /bin/sleep
arguments = 30
machine_count = 2
Requirements = target.Disk == 0 && TARGET.FileSystemDomain == "mydomain.org"
queue
On the nodes where this would execute, I have the following lines added to the generic "glidein_condor_config" file that comes with the distribution (I put these lines at the bottom of the file):
DEDICATEDSCHEDULER = "DedicatedScheduler@myusername@mylocalnode.mydomain.org"
STARTD_ATTRS = $(STARTD_ATTRS), DEDICATEDSCHEDULER
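For comparison, here is a sketch of the policy settings that, as far as I recall, the dedicated-resource example in the Condor manual places alongside those two lines on each execute node. Whether a glide-in needs all of them is an assumption, not something confirmed in this setup:

```
# Sketch only: policy expressions modeled on the manual's
# dedicated-resource example. Assumption: untested with glide-ins.
START        = True
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False
# Prefer jobs that come from the dedicated scheduler named above.
RANK         = Scheduler =?= $(DEDICATEDSCHEDULER)
```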
Everything else is a stock, unmodified install, apart from the changes to condor_config.local that I had to make to get it working in the first place. I have the DAEMON_LIST set to:
DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, SHADOW
With all this in place, when the job tries to run, both "condor_q -analyze" and "condor_q -better-analyze" report:
2 match but reject the job for unknown reasons
It appears that I'm missing a configuration parameter somewhere, either locally or remotely. I've looked through the log files, but nothing there explains why the job is being rejected. I've tried setting:
DEDICATEDSCHEDULER = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx"
in the "glidein_condor_config" file on the execute nodes, but that doesn't appear to have made a difference either.
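A minimal sketch of raising log verbosity while chasing this, assuming the standard per-daemon debug settings behave in a glide-in the same way they do in a regular install. The first two lines would go in the local condor_config.local and the last one in glidein_condor_config:

```
# Submit side: more detail from the schedd and the negotiator.
SCHEDD_DEBUG     = D_FULLDEBUG
NEGOTIATOR_DEBUG = D_FULLDEBUG
# Execute side: more detail from the startd on the glide-in nodes.
STARTD_DEBUG     = D_FULLDEBUG
```

The daemons need a condor_reconfig (or a restart) before the new levels take effect.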
Can someone please point me to a LOG file I should be looking at or let me know a parameter I should be setting?
I would really appreciate the help!
Thanks,
Steve