
[Condor-users] Several Issues With Condor 6.7.20 and Windows




Hi,

I'm currently in the process of setting up a Condor pool as follows:

Host 1 under Linux:                Central Manager + Executor
Host 2 under Windows XP:        Executor

I encountered several issues with that setup, which I want to point out here:

1.) The Windows installer relies on a user named Everyone existing. This is weak
design, as this user only exists in the English version of Windows.
I had to create this user (and delete it again afterwards), otherwise the installer fails.

2.) The Java status of my Windows host is not advertised:
running "condor_status -java" returns only the Linux host.
I set the path to java.exe in the config. What could be wrong here?

3.) And this is the worst problem:
I'm not able to do the simplest cross submit (from Linux to Windows).
I tried several configurations of vanilla jobs such as:

test_win.sub:

universe = vanilla
environment = path=c:\winnt\system32
executable = test_win.bat
output = C:\condor\printname.out.$(Process)
#error = printname.err
#log = printname.log
Arguments  = -arg1 -arg2
requirements = Activity=="idle" && Arch=="INTEL" && OpSys=="WINNT51"

queue


This produces the following starter.log on the Windows machine:

6/27 09:52:36 ******************************************************
6/27 09:52:36 ** condor_starter (CONDOR_STARTER) STARTING UP
6/27 09:52:36 ** C:\condor\bin\condor_starter.exe
6/27 09:52:36 ** $CondorVersion: 6.7.20 Jun 21 2006 $
6/27 09:52:36 ** $CondorPlatform: INTEL-WINNT50 $
6/27 09:52:36 ** PID = 1120
6/27 09:52:36 ** Log last touched 6/27 09:52:34
6/27 09:52:36 ******************************************************
6/27 09:52:36 Using config source: C:\condor\condor_config
6/27 09:52:36 Using local config sources:
6/27 09:52:36    C:\condor/condor_config.local
6/27 09:52:36 DaemonCore: Command Socket at <#####:4069>
6/27 09:52:36 Setting resource limits not implemented!
6/27 09:52:36 Communicating with shadow <#####:35132>
6/27 09:52:36 Submitting machine is "#####"
6/27 09:52:36 Starting a VANILLA universe job with ID: 124.0
6/27 09:52:36 IWD: /home/hopptho/condor_test/test
6/27 09:52:36 Output file: C:\condor\printname.out.0
6/27 09:52:36 Renice expr "10" evaluated to 10
6/27 09:52:36 About to exec C:\WINDOWS\system32\cmd.exe condor_exec.exe /Q /C condor_exec.bat -arg1 -arg2
6/27 09:52:36 Create_Process: CreateProcess failed, errno=267
6/27 09:52:36 ERROR "Create_Process(C:\WINDOWS\system32\cmd.exe,condor_exec.exe /Q /C condor_exec.bat -arg1 -arg2, ...) failed" at line 387 in file ..\src\condor_starter.V6.1\os_proc.C
6/27 09:52:36 ShutdownFast all jobs.

This behaviour is completely opaque to me. The manual claims that any combination of cross submits is possible, but
there is no word about things like

condor_exec.bat

This file does not come with Condor and I don't know its purpose.
I tried to create this file and add it to the path, but the error remains.

Another annoying problem is the files defined with "output", "error" and "log".
If I do not specify a full, valid path on the Windows machine, the submit fails
because Condor then somehow uses the CWD of the submitting host.
This in turn makes Condor fail on the Windows host because it cannot open a file like:

/home/condor/test_win.log

Did I misunderstand something, is there a better job description for such cross submits,
or did I discover some bugs here?
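
On the question of a better job description, one possibility is to let Condor's file transfer mechanism move the files instead of naming Windows paths in the submit file. The following is only a sketch under assumptions and has not been verified against this pool. should_transfer_files, when_to_transfer_output and transfer_executable are standard submit commands, but whether they resolve the error on 6.7.20 is an assumption. Windows error 267 is ERROR_DIRECTORY ("The directory name is invalid"), which would be consistent with the starter trying to use the Linux IWD /home/hopptho/condor_test/test as the working directory.

```
# Sketch: same job as test_win.sub, but using Condor file transfer.
# Assumption (untested here): relative output/error/log names are resolved
# on the submit side, and the job runs in a scratch directory on Windows.
universe                = vanilla
executable              = test_win.bat
arguments               = -arg1 -arg2
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_executable     = True
output                  = printname.out.$(Process)
error                   = printname.err.$(Process)
log                     = printname.log
requirements            = Arch == "INTEL" && OpSys == "WINNT51"
queue
```

With transfer enabled, the starter presumably copies the executable into its scratch directory under the name condor_exec.bat, which would explain the name in the log; if so, it is not a file that has to be supplied by hand.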


Kind Regards
Thomas H.