
Re: [Condor-users] submit-only host won't work with Windows



Sometimes when I submit Windows jobs, they just sit there for a while before Condor notices them, usually for about 5 minutes, although often my jobs are picked up instantly.

Try running condor_reschedule against the queue on the machine that has the stalled jobs, to see whether
that gets Condor to process them.
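
Something along these lines, as a rough sketch only. The host name is a placeholder I made up for your submit machine, and I'm assuming both tools can reach its schedd by name:

```shell
# Placeholder: replace with the machine whose queue holds the stalled jobs.
SCHEDD_HOST="submit-host.example.org"

# Ask that machine's schedd to request a new negotiation cycle.
condor_reschedule -name "$SCHEDD_HOST"

# Then look at the queue again to see whether the job has left the idle state.
condor_q -name "$SCHEDD_HOST"
```

If the job starts shortly afterwards, the delay was probably just the wait for the next negotiation cycle.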


JW




Dr Ian C. Smith wrote:


Hi,

I'm having a great deal of difficulty trying to set up
a submit only host to submit jobs to Windows execution
hosts. It seems to work fine when the jobs are submitted
to Linux machines but the job just stays in the queue
with Windows.

The job is pretty trivial:

universe = vanilla
transfer_files=always
requirements = ( Arch=="Intel") && ( OpSys=="WINNT50" )
executable = hosttest.bat
output = host55.out
log = host55.log
notification = Error
queue


but it just stays in the queue in the Matched state:



$ condor_q -analyze

Warning:  No PREEMPTION_REQUIREMENTS expression in config file --- assuming FALSE

-- Submitter: root@xxxxxxxxxxxxxxx : <138.253.100.177:63437> : ulgp2.liv.ac.uk
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
004.000:  Run analysis summary.  Of 96 machines,
     93 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match, but are serving users with a better priority in the pool
      3 match, but prefer another specific job despite its worse user-priority
      0 match, but will not currently preempt their existing job
      0 are available to run your job
        Last successful match: Mon Dec 13 16:09:25 2004

1 jobs; 1 idle, 0 running, 0 held


a tail of SchedLog looks like:


12/13 16:09:24 Sent ad to central manager for smithic@xxxxxxxxxxxxxxx
12/13 16:09:25 Activity on stashed negotiator socket
12/13 16:09:25 Negotiating for owner: smithic@xxxxxxxxxxxxxxx
12/13 16:09:25 Checking consistency running and runnable jobs
12/13 16:09:25 Tables are consistent
12/13 16:09:25 Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
12/13 16:09:25 condor_read(): recv() returned -1, errno = 131, assuming failure.
12/13 16:09:25 Response problem from startd.
12/13 16:09:25 Sent RELEASE_CLAIM to startd on <138.253.102.199:1027>
12/13 16:09:25 Match record (<138.253.102.199:1027>, 4, 0) deleted
12/13 16:09:29 Sent ad to central manager for smithic@xxxxxxxxxxxxxxx
12/13 16:11:40 DaemonCore: PERMISSION DENIED to unknown user from host <138.253.100.176:63491> for command 416 (NEGOTIATE)


I can't work out the PERMISSION DENIED error. Oddly, the IP address in it is that of
the CondorView host.


I'm using Condor 6.6.5 under Solaris 9.

Does anyone have any ideas?

cheers,

-ian.