[Condor-users] broken condor_exec.exe path on condor-g submit to windows pool



I'm having some problems running jobs on a remote Windows pool.

The setup: Condor-G -> GT5 resource w/ Condor LRMS -> WinXP startd

The remote pool doesn't have a shared filesystem, so I use globusrsl in the Condor-G submit file to tell the remote pool to use its own file transfer mechanism. It has separate queues (Globus jobmanagers) for Linux and Windows jobs. The Linux queue works well, but in the Windows queue, jobs fail to execute at the startd:

===============================================================
06/18 22:22:08 Starting a VANILLA universe job with ID: 37.0
06/18 22:22:09 Tracking process family by login "condor-reuse-slot1"
06/18 22:22:09 IWD: C:\Progra~1\Condor\execute\dir_428
06/18 22:22:09 Output file: C:\Progra~1\Condor\execute\dir_428\_condor_stdout
06/18 22:22:10 Error file: C:\Progra~1\Condor\execute\dir_428\_condor_stderr
06/18 22:22:10 Renice expr "10" evaluated to 10
06/18 22:22:10 About to exec C:\Progra~1\Condor\execute\dir_428\condor_exec.gass_cache/local/md5/32/4a5afa0e96a19fa2244e8dd70116ce/md5/f4/91c2ba79ec192a3e127340f8999f71/data
06/18 22:22:10 GetExecutableAndArgumentsByExtention: failed to find extension *.gass_cache/local/md5/32/4a5afa0e96a19fa2244e8dd70116ce/md5/f4/91c2ba79ec192a3e127340f8999f71/data in the registry (last-error = 2).
06/18 22:22:10 Create_Process(): Failed to find an executable for extension *.gass_cache/local/md5/32/4a5afa0e96a19fa2244e8dd70116ce/md5/f4/91c2ba79ec192a3e127340f8999f71/data
06/18 22:22:10 ERROR: C:\Progra~1\Condor\execute\dir_428\condor_exec.gass_cache\local\md5\32\4a5afa0e96a19fa2244e8dd70116ce\md5\f4\91c2ba79ec192a3e127340f8999f71\data.exe is not a valid Windows executable
06/18 22:22:10 ERROR "Create_Process(C:\Progra~1\Condor\execute\dir_428\condor_exec.gass_cache/local/md5/32/4a5afa0e96a19fa2244e8dd70116ce/md5/f4/91c2ba79ec192a3e127340f8999f71/data,, ...) failed: " at line 530 in file ..\src\condor_starter.V6.1\os_proc.cpp
06/18 22:22:10 ShutdownFast all jobs.
===============================================================

The gass_cache part should not be there. If I understand correctly, Condor-G uses GASS internally to transfer files to the GT5 resource, but once they are there and a condor_submit is generated, Condor's own file transfer mechanism kicks in, so the startd should never see the cache URL. I've checked that the executable is transferred to Condor\execute\dir_pid\condor_exec.exe on the WinXP startd, and it is.
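
One possible reading of the log, offered only as a guess: the bogus name looks like "condor_exec." followed by whatever comes after the last "." of a GASS cache path. The Python sketch below only reproduces that string arithmetic. The path tail is copied from the log above; the split-on-last-dot rule and the function name naive_extension are assumptions made for illustration, not confirmed Condor behaviour.

```python
# Path tail as it appears in the starter log (everything from ".gass_cache" on).
cache_tail = (
    ".gass_cache/local/md5/32/4a5afa0e96a19fa2244e8dd70116ce"
    "/md5/f4/91c2ba79ec192a3e127340f8999f71/data"
)


def naive_extension(path: str) -> str:
    """Return everything after the last '.' in path.

    Assumption: directory separators are NOT treated specially, so a dot in
    a directory name (".gass_cache") is mistaken for the extension separator.
    """
    return path.rsplit(".", 1)[1] if "." in path else ""


ext = naive_extension(cache_tail)

# Hypothetical renaming step: executable base name plus the derived "extension".
renamed = "condor_exec." + ext
print(renamed)
```

If that guess is right, the string printed here matches the file name the starter tried to exec, which would point at how the executable's extension is derived rather than at the file transfer itself.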

How can I make this work?

TIA, Rob