
Re: [Condor-users] Several Issues With Condor 6.7.20 and Windows



On 6/27/06, thomas.t.hoppe@xxxxxxxxxxxxxxxxxxx
<thomas.t.hoppe@xxxxxxxxxxxxxxxxxxx> wrote:

Hi,

I'm currently in the process of setting up a Condor pool as follows:

Host 1 under Linux:                Central Manager + Executor
Host 2 under Windows XP:        Executor


3.) And this is the worst problem:
I'm not able to do the simplest cross submit (from Linux to Windows).
I tried several configurations of vanilla jobs such as:

test_win.sub:

universe = vanilla
environment = path=c:\winnt\system32
executable = test_win.bat
output = C:\condor\printname.out.$(Process)
#error = printname.err
#log = printname.log
Arguments  = -arg1 -arg2
requirements = Activity=="idle" && Arch=="INTEL" && OpSys=="WINNT51"

queue


This leads to a starter.log on the windows machine:

6/27 09:52:36 ******************************************************
6/27 09:52:36 ** condor_starter (CONDOR_STARTER) STARTING UP
6/27 09:52:36 ** C:\condor\bin\condor_starter.exe
6/27 09:52:36 ** $CondorVersion: 6.7.20 Jun 21 2006 $
6/27 09:52:36 ** $CondorPlatform: INTEL-WINNT50 $
6/27 09:52:36 ** PID = 1120
6/27 09:52:36 ** Log last touched 6/27 09:52:34
6/27 09:52:36 ******************************************************
6/27 09:52:36 Using config source: C:\condor\condor_config
6/27 09:52:36 Using local config sources:
6/27 09:52:36    C:\condor/condor_config.local
6/27 09:52:36 DaemonCore: Command Socket at <#####:4069>
6/27 09:52:36 Setting resource limits not implemented!
6/27 09:52:36 Communicating with shadow <#####:35132>
6/27 09:52:36 Submitting machine is "#####"
6/27 09:52:36 Starting a VANILLA universe job with ID: 124.0
6/27 09:52:36 IWD: /home/hopptho/condor_test/test
6/27 09:52:36 Output file: C:\condor\printname.out.0
6/27 09:52:36 Renice expr "10" evaluated to 10
6/27 09:52:36 About to exec C:\WINDOWS\system32\cmd.exe condor_exec.exe /Q /C condor_exec.bat -arg1 -arg2
6/27 09:52:36 Create_Process: CreateProcess failed, errno=267
6/27 09:52:36 ERROR "Create_Process(C:\WINDOWS\system32\cmd.exe,condor_exec.exe /Q /C condor_exec.bat -arg1 -arg2, ...) failed" at line 387 in file ..\src\condor_starter.V6.1\os_proc.C
6/27 09:52:36 ShutdownFast all jobs.

This is error 267 from the call to the Windows API function CreateProcess.
It means "Invalid Directory" (though this can also mean that you have
access/privilege issues).

Are you sure that C:\WINDOWS\system32\cmd.exe exists on the execute
machine and is accessible to absolutely any user on the machine?
(It sounds daft, but this can in fact be locked out.)

Is your Windows machine set up to have the relevant dynamic,
almost-no-privilege users such as condor-reuse-vm1 (or more if you have
SMP machines)? If so, do they have access rights both to cmd.exe and to
the directory where execution happens?

My best guess, however, is that it does not have access rights to
create new files in C:\condor (as your output setting specifies), and it
really shouldn't be writing there. It is best to leave the output
and error settings relative. They are then relative to your
submission directory when they are written back, and are written
relative to the execution directory on the execute machine (which is
the best place for them).

log, meanwhile, *can* be made absolute, but I again suggest that on
Windows keeping it relative is a good idea, since this is written by
the shadow on the local machine.
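
A minimal sketch of test_win.sub with those suggestions applied. Only the
output, error and log lines are changed to plain relative names; the
should_transfer_files and when_to_transfer_output lines are not in the
original file and are an assumption here, on the basis that a Linux to
Windows submit has no shared filesystem. Treat it as a starting point,
not a verified fix:

```
# test_win.sub (sketch): output, error and log use relative names
universe = vanilla
environment = path=c:\winnt\system32
executable = test_win.bat
arguments = -arg1 -arg2
output = printname.out.$(Process)
error = printname.err.$(Process)
log = printname.log
# Assumption: turn on Condor's file transfer so the batch file and the
# output files are copied between the submit and execute machines.
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
requirements = Activity=="idle" && Arch=="INTEL" && OpSys=="WINNT51"

queue
```

With relative names the output and error files should end up back in the
directory you submitted from, and nothing needs write access to C:\condor.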

This behaviour is completely opaque to me. The manual claims that any
combination of cross submits is possible, but
there is no word about things like

condor_exec.bat

This file does not come with Condor and I don't know its purpose.
I tried to create this file and add it to the path, but the error remains.

This file is simply Condor internally renaming your batch file for the
copy across and execution. It avoids any issues with the name of the
script / exe being illegal on the target machine, and it makes the
Condor execution process easier to spot and track. (You may dislike this,
but if you have any program or script relying on its own file name I
suggest you change it.)

Another annoying problem is the files defined with "output", "error" and
"log".
If I do not specify a full, valid path on the Windows machine, the submit
fails
because Condor somehow uses the CWD of the submitting host then.
This in turn makes condor fail on the Windows host because it cannot open a
file like:

/home/condor/test_win.log

As explained above, output and error should be left relative (just give
each a name without any directory-indicating slashes).

I would have thought that the log setting would be happy either way.
It is opened by the local shadow on your submit machine only, so it should
have no issues, but since my Condor on *nix experience is nil I
wouldn't state this with absolute certainty.

Matt