
[Condor-users] Problem Submit Job from Windows to be executed on OS X



Hi all,

I'm submitting jobs from Windows to be run on OS X machines.
The DAGMan logs all go to the Windows submit machine's local drive to avoid
log-locking issues.

The matching is successful and the job tries to run on the OS X machine, but after a while the job is returned to its initial state and sent to another OS X machine. This cycle keeps repeating until the
error count is reached.

Here is the Starter log from the OS X machine which is trying to run the job:

9/28 15:53:45 Submitting machine is "supermeepxp.afx.com"
9/28 15:53:45 File transfer completed successfully.
9/28 15:53:46 Starting a VANILLA universe job with ID: 94.0
9/28 15:53:46 IWD: /Users/condor//execute/dir_6105
9/28 15:53:46 Output file: /Users/condor//execute/dir_6105/_condor_stdout
9/28 15:53:46 Error file: /Users/condor//execute/dir_6105/_condor_stderr
9/28 15:53:46 About to exec /Users/condor//execute/dir_6105/condor_exec.exe 15 15
9/28 15:53:46 Create_Process: child failed with errno 8 (Exec format error) before exec()
9/28 15:53:46 ERROR "Create_Process(/Users/condor//execute/dir_6105/condor_exec.exe,15 15, ...) failed" at line 393 in file os_proc.C
9/28 15:53:46 ShutdownFast all jobs.
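
Errno 8 (Exec format error) generally means the kernel on the execute machine did not recognize the file as something it can run. One guess, not confirmed by the logs above, is that condor_exec.exe is a Windows binary transferred from the submit machine. A rough way to check is to look at the first bytes of the executable. The sketch below is only an illustration: `detect_format` and `describe_file` are made-up helper names, and the list of signatures is not exhaustive.

```python
# Hypothetical helper: guess an executable's format from its leading bytes.
# The signature list is illustrative, not exhaustive.

MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce",  # Mach-O 32-bit, big-endian (e.g. PowerPC)
    b"\xce\xfa\xed\xfe",  # Mach-O 32-bit, little-endian (e.g. x86)
    b"\xfe\xed\xfa\xcf",  # Mach-O 64-bit, big-endian
    b"\xcf\xfa\xed\xfe",  # Mach-O 64-bit, little-endian
    b"\xca\xfe\xba\xbe",  # Mach-O universal ("fat") binary
}


def detect_format(header: bytes) -> str:
    """Return a rough format name for the first bytes of a file."""
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"
    if header[:4] == b"\x7fELF":
        return "ELF"
    if header[:2] == b"MZ":
        return "Windows PE/DOS"
    if header[:2] == b"#!":
        return "script"
    return "unknown"


def describe_file(path: str) -> str:
    """Read the first four bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return detect_format(handle.read(4))
```

If `describe_file` reports "Windows PE/DOS" for the file that was submitted, OS X would not be able to execute it, which would be consistent with the error in the Starter log.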

Here is the Shadow log from the Windows Submitter:

9/28 15:53:19 ******************************************************
9/28 15:53:19 Using config source: C:\condor\condor_config
9/28 15:53:19 Using local config sources:
9/28 15:53:19    C:\condor/condor_config.local
9/28 15:53:19 DaemonCore: Command Socket at <192.168.0.44:1924>
9/28 15:53:43 Initializing a VANILLA shadow for job 94.0
9/28 15:53:43 (94.0) (1596): UserLog::initialize: safe_fopen_wrapper("C:\Condor_Submit_Data\maya\rman_cube_09_28_2007_15_41_55/C:/Condor_Submit_Data/maya/rman_cube_09_28_2007_15_41_55/rman_cube_SubmitLog.txt") failed - errno 22 (Invalid argument)
9/28 15:53:43 (94.0) (1596): Request to run on <192.168.0.12:59683> was ACCEPTED
9/28 15:53:44 (94.0) (1596): Job 94.0 going into Hold state (code 6,8): Error from starter on vm2@xxxxxxxxxxxxxxxxx: Failed to execute '/Users/condor//execute/dir_6105/condor_exec.exe' with arguments 15 15: Exec format error
9/28 15:53:44 (94.0) (1596): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 112
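
The path in the UserLog::initialize line appears to be the job's submit directory followed by a second absolute path that starts with its own drive letter, which would explain errno 22 (Invalid argument) on Windows. This is an inference from the log text only; how Condor actually builds the path is not shown here. The sketch below just illustrates the difference between naive concatenation and a join that respects absolute paths. `naive_log_path` and `resolve_log_path` are hypothetical names, not Condor functions.

```python
import ntpath  # Windows path rules, usable on any platform

# Values taken from the Shadow log above.
IWD = r"C:\Condor_Submit_Data\maya\rman_cube_09_28_2007_15_41_55"
LOG = "C:/Condor_Submit_Data/maya/rman_cube_09_28_2007_15_41_55/rman_cube_SubmitLog.txt"


def naive_log_path(iwd: str, log: str) -> str:
    """Blindly append the log path to the working directory."""
    return iwd + "/" + log


def resolve_log_path(iwd: str, log: str) -> str:
    """Use the log path as-is when it is absolute, else join it to iwd."""
    if ntpath.isabs(log):
        return ntpath.normpath(log)
    return ntpath.normpath(ntpath.join(iwd, log))


if __name__ == "__main__":
    print(naive_log_path(IWD, LOG))    # contains a second "C:" in the middle
    print(resolve_log_path(IWD, LOG))  # a single, well-formed Windows path
```

The naive result reproduces the shape of the path in the log, with a second drive letter in the middle, while the second function keeps the absolute log path unchanged.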