
[Condor-users] Condor on Windows



Hi again!

Here are some of the Condor log files. If you think the remaining files might contain useful information, I can send them all.

 

MasterLog.log

8/24 10:50:10 ******************************************************

8/24 10:50:10 ** Condor (CONDOR_MASTER) STARTING UP

8/24 10:50:10 ** C:\condor\bin\condor_master.exe

8/24 10:50:10 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 10:50:10 ** $CondorPlatform: INTEL-WINNT50 $

8/24 10:50:10 ** PID = 2380

8/24 10:50:10 ** Log last touched time unavailable (No such file or directory)

8/24 10:50:10 ******************************************************

8/24 10:50:10 Using config source: C:\condor\condor_config

8/24 10:50:10 Using local config sources:

8/24 10:50:10    C:\condor/condor_config.local

8/24 10:50:10 DaemonCore: Command Socket at <10.254.254.219:1065>

8/24 10:50:10 Started DaemonCore process "C:\condor/bin/condor_collector.exe", pid and pgroup = 2436

8/24 10:50:10 Started DaemonCore process "C:\condor/bin/condor_negotiator.exe", pid and pgroup = 2468

8/24 10:50:10 Started DaemonCore process "C:\condor/bin/condor_schedd.exe", pid and pgroup = 2520

8/24 10:50:10 Started DaemonCore process "C:\condor/bin/condor_startd.exe", pid and pgroup = 3052

8/24 11:50:10 Preen pid is 1540

8/24 11:50:10 DaemonCore: Command received via UDP from host <10.254.254.219:1559>

8/24 11:50:10 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())

8/24 11:50:10 Child 1540 died, but not a daemon -- Ignored

8/24 15:57:59 DaemonCore: Command received via UDP from host <10.254.254.219:3726>

8/24 15:57:59 DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())

8/24 15:57:59 Got SIGQUIT.  Performing fast shutdown.

8/24 15:57:59 Sent signal 3 to COLLECTOR (pid 2436)

8/24 15:57:59 Sent signal 3 to NEGOTIATOR (pid 2468)

8/24 15:57:59 Sent signal 3 to SCHEDD (pid 2520)

8/24 15:57:59 Sent signal 3 to STARTD (pid 3052)

8/24 15:58:00 DaemonCore: Command received via UDP from host <10.254.254.219:3743>

8/24 15:58:00 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())

8/24 15:58:00 The COLLECTOR (pid 2436) exited with status 0

8/24 15:58:00 DaemonCore: Command received via UDP from host <10.254.254.219:3744>

8/24 15:58:00 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())

8/24 15:58:00 The NEGOTIATOR (pid 2468) exited with status 0

8/24 15:58:00 The SCHEDD (pid 2520) exited with status 0

8/24 15:58:00 The STARTD (pid 3052) exited with status 0

8/24 15:58:00 All daemons are gone.  Exiting.

8/24 15:58:00 **** Condor (condor_MASTER) EXITING WITH STATUS 0

 

 

StarterLog.log

8/24 14:16:15 ******************************************************

8/24 14:16:15 ** condor_starter (CONDOR_STARTER) STARTING UP

8/24 14:16:15 ** C:\condor\bin\condor_starter.exe

8/24 14:16:15 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:16:15 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:16:15 ** PID = 3684

8/24 14:16:15 ** Log last touched 8/24 14:15:44

8/24 14:16:15 ******************************************************

8/24 14:16:15 Using config source: C:\condor\condor_config

8/24 14:16:15 Using local config sources:

8/24 14:16:15    C:\condor/condor_config.local

8/24 14:16:15 DaemonCore: Command Socket at <10.254.254.219:2823>

8/24 14:16:15 Setting resource limits not implemented!

8/24 14:16:15 Communicating with shadow <10.254.254.219:2821>

8/24 14:16:15 Submitting machine is "comp12"

8/24 14:16:16 File transfer completed successfully.

8/24 14:16:17 Starting a VANILLA universe job with ID: 8.0

8/24 14:16:17 IWD: C:\condor/execute\dir_3684

8/24 14:16:17 Output file: C:\condor/execute\dir_3684\test.out

8/24 14:16:17 Error file: C:\condor/execute\dir_3684\test.err

8/24 14:16:17 Renice expr "10" evaluated to 10

8/24 14:16:17 About to exec C:\condor\execute\dir_3684\condor_exec.exe

8/24 14:16:17 ERROR: C:\condor\execute\dir_3684\condor_exec.exe is not a valid Windows executable

8/24 14:16:17 ERROR "Create_Process(C:\condor\execute\dir_3684\condor_exec.exe,, ...) failed" at line 393 in file ..\src\condor_starter.V6.1\os_proc.C

8/24 14:16:17 ShutdownFast all jobs.

8/24 14:33:27 ******************************************************

 

 

ShadowLog.log

8/24 10:52:47 ******************************************************

8/24 10:52:47 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 10:52:47 ** C:\condor\bin\condor_shadow.exe

8/24 10:52:47 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 10:52:47 ** $CondorPlatform: INTEL-WINNT50 $

8/24 10:52:47 ** PID = 2144

8/24 10:52:47 ** Log last touched time unavailable (No such file or directory)

8/24 10:52:47 ******************************************************

8/24 10:52:47 Using config source: C:\condor\condor_config

8/24 10:52:47 Using local config sources:

8/24 10:52:47    C:\condor/condor_config.local

8/24 10:52:47 DaemonCore: Command Socket at <10.254.254.219:1112>

8/24 10:52:47 Initializing a VANILLA shadow for job 1.0

8/24 10:52:47 (1.0) (2144): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 10:52:48 (1.0) (2144): Job 1.0 terminated: exited with status 0

8/24 10:52:48 (1.0) (2144): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 100

8/24 10:54:24 ******************************************************

8/24 10:54:24 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 10:54:24 ** C:\condor\bin\condor_shadow.exe

8/24 10:54:24 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 10:54:24 ** $CondorPlatform: INTEL-WINNT50 $

8/24 10:54:24 ** PID = 2316

8/24 10:54:24 ** Log last touched 8/24 10:52:48

8/24 10:54:24 ******************************************************

8/24 10:54:24 Using config source: C:\condor\condor_config

8/24 10:54:24 Using local config sources:

8/24 10:54:24    C:\condor/condor_config.local

8/24 10:54:24 DaemonCore: Command Socket at <10.254.254.219:1160>

8/24 10:54:24 Initializing a VANILLA shadow for job 2.0

8/24 10:54:25 (2.0) (2316): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 10:54:26 (2.0) (2316): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_2336\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 10:54:28 ******************************************************

8/24 10:54:28 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 10:54:28 ** C:\condor\bin\condor_shadow.exe

8/24 10:54:28 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 10:54:28 ** $CondorPlatform: INTEL-WINNT50 $

8/24 10:54:28 ** PID = 2752

8/24 10:54:28 ** Log last touched 8/24 10:54:26

8/24 10:54:28 ******************************************************

8/24 10:54:28 Using config source: C:\condor\condor_config

8/24 10:54:28 Using local config sources:

8/24 10:54:28    C:\condor/condor_config.local

8/24 10:54:28 DaemonCore: Command Socket at <10.254.254.219:1177>

8/24 10:54:28 Initializing a VANILLA shadow for job 2.0

8/24 10:54:28 (2.0) (2752): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 10:54:30 (2.0) (2752): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <10.254.254.219:1068>.

8/24 10:54:30 (2.0) (2752): Can no longer talk to condor_starter <10.254.254.219:1068>

8/24 10:54:30 (2.0) (2752): Trying to reconnect to disconnected job

8/24 10:54:30 (2.0) (2752): LastJobLeaseRenewal: 1187942070 Fri Aug 24 10:54:30 2007

8/24 10:54:30 (2.0) (2752): JobLeaseDuration: 1200 seconds

8/24 10:54:30 (2.0) (2752): JobLeaseDuration remaining: 1200

8/24 10:54:30 (2.0) (2752): Attempting to locate disconnected starter

8/24 10:54:30 (2.0) (2752): Found starter: <10.254.254.219:1179>

8/24 10:54:30 (2.0) (2752): Attempting to reconnect to starter <10.254.254.219:1179>

8/24 10:54:31 (2.0) (2752): attempt to connect to <10.254.254.219:1179> failed: connect errno = 10061 connection refused.

8/24 10:54:31 (2.0) (2752): Attempt to reconnect failed: Failed to connect to starter <10.254.254.219:1179>

8/24 10:54:31 (2.0) (2752): JobLeaseDuration remaining: 1199

8/24 10:54:31 (2.0) (2752): Scheduling another attempt to reconnect in 8 seconds

8/24 10:54:39 (2.0) (2752): Attempting to locate disconnected starter

8/24 10:54:39 (2.0) (2752): locateStarter(): ClaimId (<10.254.254.219:1068>#1187941812#4) and GlobalJobId ( comp12#1187942060#2.0 ) not found

8/24 10:54:39 (2.0) (2752): Reconnect FAILED: Job not found at execution machine

8/24 10:54:39 (2.0) (2752): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

8/24 14:15:42 ******************************************************

8/24 14:15:42 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:15:42 ** C:\condor\bin\condor_shadow.exe

8/24 14:15:42 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:15:42 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:15:42 ** PID = 2252

8/24 14:15:42 ** Log last touched 8/24 10:54:39

8/24 14:15:42 ******************************************************

8/24 14:15:42 Using config source: C:\condor\condor_config

8/24 14:15:42 Using local config sources:

8/24 14:15:42    C:\condor/condor_config.local

8/24 14:15:42 DaemonCore: Command Socket at <10.254.254.219:2756>

8/24 14:15:42 Initializing a VANILLA shadow for job 6.0

8/24 14:15:42 (6.0) (2252): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:15:44 ******************************************************

8/24 14:15:44 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:15:44 ** C:\condor\bin\condor_shadow.exe

8/24 14:15:44 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:15:44 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:15:44 ** PID = 3252

8/24 14:15:44 ** Log last touched 8/24 14:15:42

8/24 14:15:44 ******************************************************

8/24 14:15:44 Using config source: C:\condor\condor_config

8/24 14:15:44 Using local config sources:

8/24 14:15:44    C:\condor/condor_config.local

8/24 14:15:44 DaemonCore: Command Socket at <10.254.254.219:2767>

8/24 14:15:44 Initializing a VANILLA shadow for job 7.0

8/24 14:15:44 (7.0) (3252): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:15:44 (6.0) (2252): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_3500\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:15:46 (7.0) (3252): ERROR "Error from starter on vm2@comp12: Create_Process(C:\condor\execute\dir_3288\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:16:15 ******************************************************

8/24 14:16:15 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:16:15 ** C:\condor\bin\condor_shadow.exe

8/24 14:16:15 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:16:15 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:16:15 ** PID = 1648

8/24 14:16:15 ** Log last touched 8/24 14:15:46

8/24 14:16:15 ******************************************************

8/24 14:16:15 Using config source: C:\condor\condor_config

8/24 14:16:15 Using local config sources:

8/24 14:16:15    C:\condor/condor_config.local

8/24 14:16:15 DaemonCore: Command Socket at <10.254.254.219:2821>

8/24 14:16:15 Initializing a VANILLA shadow for job 8.0

8/24 14:16:15 (8.0) (1648): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:16:17 (8.0) (1648): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <10.254.254.219:1068>.

8/24 14:16:17 (8.0) (1648): Can no longer talk to condor_starter <10.254.254.219:1068>

8/24 14:16:17 (8.0) (1648): Trying to reconnect to disconnected job

8/24 14:16:17 (8.0) (1648): LastJobLeaseRenewal: 1187954177 Fri Aug 24 14:16:17 2007

8/24 14:16:17 (8.0) (1648): JobLeaseDuration: 1200 seconds

8/24 14:16:17 (8.0) (1648): JobLeaseDuration remaining: 1200

8/24 14:16:17 (8.0) (1648): Attempting to locate disconnected starter

8/24 14:16:17 (8.0) (1648): Found starter: <10.254.254.219:2823>

8/24 14:16:17 (8.0) (1648): Attempting to reconnect to starter <10.254.254.219:2823>

8/24 14:16:18 (8.0) (1648): attempt to connect to <10.254.254.219:2823> failed: connect errno = 10061 connection refused.

8/24 14:16:18 (8.0) (1648): Attempt to reconnect failed: Failed to connect to starter <10.254.254.219:2823>

8/24 14:16:18 (8.0) (1648): JobLeaseDuration remaining: 1199

8/24 14:16:18 (8.0) (1648): Scheduling another attempt to reconnect in 8 seconds

8/24 14:16:26 (8.0) (1648): Attempting to locate disconnected starter

8/24 14:16:26 (8.0) (1648): locateStarter(): ClaimId (<10.254.254.219:1068>#1187941812#11) and GlobalJobId ( comp12#1187954170#8.0 ) not found

8/24 14:16:26 (8.0) (1648): Reconnect FAILED: Job not found at execution machine

8/24 14:16:26 (8.0) (1648): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

8/24 14:33:27 ******************************************************

8/24 14:33:27 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:27 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:27 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:27 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:27 ** PID = 2752

8/24 14:33:27 ** Log last touched 8/24 14:16:26

8/24 14:33:27 ******************************************************

8/24 14:33:27 Using config source: C:\condor\condor_config

8/24 14:33:27 Using local config sources:

8/24 14:33:27    C:\condor/condor_config.local

8/24 14:33:27 DaemonCore: Command Socket at <10.254.254.219:2959>

8/24 14:33:27 Initializing a VANILLA shadow for job 9.0

8/24 14:33:27 (9.0) (2752): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:28 (9.0) (2752): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_2880\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:33:29 ******************************************************

8/24 14:33:29 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:29 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:29 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:29 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:29 ** PID = 684

8/24 14:33:29 ** Log last touched 8/24 14:33:28

8/24 14:33:29 ******************************************************

8/24 14:33:29 Using config source: C:\condor\condor_config

8/24 14:33:29 Using local config sources:

8/24 14:33:29    C:\condor/condor_config.local

8/24 14:33:29 DaemonCore: Command Socket at <10.254.254.219:2976>

8/24 14:33:29 Initializing a VANILLA shadow for job 10.0

8/24 14:33:29 (10.0) (684): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:30 (10.0) (684): ERROR "Error from starter on vm2@comp12: Create_Process(C:\condor\execute\dir_3120\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:33:31 ******************************************************

8/24 14:33:31 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:31 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:31 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:31 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:31 ** PID = 1936

8/24 14:33:31 ** Log last touched 8/24 14:33:30

8/24 14:33:31 ******************************************************

8/24 14:33:31 Using config source: C:\condor\condor_config

8/24 14:33:31 Using local config sources:

8/24 14:33:31    C:\condor/condor_config.local

8/24 14:33:31 DaemonCore: Command Socket at <10.254.254.219:2991>

8/24 14:33:31 Initializing a VANILLA shadow for job 9.0

8/24 14:33:31 (9.0) (1936): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:32 (9.0) (1936): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <10.254.254.219:1068>.

8/24 14:33:32 (9.0) (1936): Can no longer talk to condor_starter <10.254.254.219:1068>

8/24 14:33:33 (9.0) (1936): Trying to reconnect to disconnected job

8/24 14:33:33 (9.0) (1936): LastJobLeaseRenewal: 1187955212 Fri Aug 24 14:33:32 2007

8/24 14:33:33 (9.0) (1936): JobLeaseDuration: 1200 seconds

8/24 14:33:33 (9.0) (1936): JobLeaseDuration remaining: 1199

8/24 14:33:33 (9.0) (1936): Attempting to locate disconnected starter

8/24 14:33:33 (9.0) (1936): Found starter: <10.254.254.219:2993>

8/24 14:33:33 (9.0) (1936): Attempting to reconnect to starter <10.254.254.219:2993>

8/24 14:33:33 ******************************************************

8/24 14:33:33 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:33 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:33 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:33 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:33 ** PID = 3220

8/24 14:33:33 ** Log last touched 8/24 14:33:33

8/24 14:33:33 ******************************************************

8/24 14:33:33 Using config source: C:\condor\condor_config

8/24 14:33:33 Using local config sources:

8/24 14:33:33    C:\condor/condor_config.local

8/24 14:33:33 DaemonCore: Command Socket at <10.254.254.219:3003>

8/24 14:33:33 Initializing a VANILLA shadow for job 10.0

8/24 14:33:34 (10.0) (3220): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:34 (9.0) (1936): attempt to connect to <10.254.254.219:2993> failed: connect errno = 10061 connection refused.

8/24 14:33:34 (9.0) (1936): Attempt to reconnect failed: Failed to connect to starter <10.254.254.219:2993>

8/24 14:33:34 (9.0) (1936): JobLeaseDuration remaining: 1199

8/24 14:33:34 (9.0) (1936): Scheduling another attempt to reconnect in 8 seconds

8/24 14:33:35 (10.0) (3220): ERROR "Error from starter on vm2@comp12: Create_Process(C:\condor\execute\dir_3692\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:33:38 ******************************************************

8/24 14:33:38 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:38 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:38 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:38 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:38 ** PID = 208

8/24 14:33:38 ** Log last touched 8/24 14:33:35

8/24 14:33:38 ******************************************************

8/24 14:33:38 Using config source: C:\condor\condor_config

8/24 14:33:38 Using local config sources:

8/24 14:33:38    C:\condor/condor_config.local

8/24 14:33:38 DaemonCore: Command Socket at <10.254.254.219:3022>

8/24 14:33:38 Initializing a VANILLA shadow for job 10.0

8/24 14:33:38 (10.0) (208): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:39 (10.0) (208): ERROR "Error from starter on vm2@comp12: Create_Process(C:\condor\execute\dir_3668\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:33:42 (9.0) (1936): Attempting to locate disconnected starter

8/24 14:33:42 (9.0) (1936): locateStarter(): ClaimId (<10.254.254.219:1068>#1187941812#14) and GlobalJobId ( comp12#1187955110#9.0 ) not found

8/24 14:33:42 (9.0) (1936): Reconnect FAILED: Job not found at execution machine

8/24 14:33:42 (9.0) (1936): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

8/24 14:33:42 ******************************************************

8/24 14:33:42 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:33:42 ** C:\condor\bin\condor_shadow.exe

8/24 14:33:42 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:33:42 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:33:42 ** PID = 2316

8/24 14:33:42 ** Log last touched 8/24 14:33:42

8/24 14:33:42 ******************************************************

8/24 14:33:42 Using config source: C:\condor\condor_config

8/24 14:33:42 Using local config sources:

8/24 14:33:42    C:\condor/condor_config.local

8/24 14:33:42 DaemonCore: Command Socket at <10.254.254.219:3044>

8/24 14:33:42 Initializing a VANILLA shadow for job 10.0

8/24 14:33:42 (10.0) (2316): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:33:43 (10.0) (2316): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <10.254.254.219:1068>.

8/24 14:33:43 (10.0) (2316): Can no longer talk to condor_starter <10.254.254.219:1068>

8/24 14:33:44 (10.0) (2316): Trying to reconnect to disconnected job

8/24 14:33:44 (10.0) (2316): LastJobLeaseRenewal: 1187955223 Fri Aug 24 14:33:43 2007

8/24 14:33:44 (10.0) (2316): JobLeaseDuration: 1200 seconds

8/24 14:33:44 (10.0) (2316): JobLeaseDuration remaining: 1199

8/24 14:33:44 (10.0) (2316): Attempting to locate disconnected starter

8/24 14:33:44 (10.0) (2316): Found starter: <10.254.254.219:3046>

8/24 14:33:44 (10.0) (2316): Attempting to reconnect to starter <10.254.254.219:3046>

8/24 14:33:44 (10.0) (2316): attempt to connect to <10.254.254.219:3046> failed: connect errno = 10061 connection refused.

8/24 14:33:44 (10.0) (2316): Attempt to reconnect failed: Failed to connect to starter <10.254.254.219:3046>

8/24 14:33:44 (10.0) (2316): JobLeaseDuration remaining: 1200

8/24 14:33:44 (10.0) (2316): Scheduling another attempt to reconnect in 8 seconds

8/24 14:33:52 (10.0) (2316): Attempting to locate disconnected starter

8/24 14:33:53 (10.0) (2316): locateStarter(): ClaimId (<10.254.254.219:1068>#1187941812#12) and GlobalJobId ( comp12#1187955203#10.0 ) not found

8/24 14:33:53 (10.0) (2316): Reconnect FAILED: Job not found at execution machine

8/24 14:33:53 (10.0) (2316): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

8/24 14:50:48 ******************************************************

8/24 14:50:48 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:50:48 ** C:\condor\bin\condor_shadow.exe

8/24 14:50:48 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:50:48 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:50:48 ** PID = 1452

8/24 14:50:48 ** Log last touched 8/24 14:33:53

8/24 14:50:48 ******************************************************

8/24 14:50:48 Using config source: C:\condor\condor_config

8/24 14:50:48 Using local config sources:

8/24 14:50:48    C:\condor/condor_config.local

8/24 14:50:48 DaemonCore: Command Socket at <10.254.254.219:3211>

8/24 14:50:48 Initializing a VANILLA shadow for job 11.0

8/24 14:50:48 (11.0) (1452): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:50:49 (11.0) (1452): condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <10.254.254.219:1068>.

8/24 14:50:49 (11.0) (1452): Can no longer talk to condor_starter <10.254.254.219:1068>

8/24 14:50:49 (11.0) (1452): Trying to reconnect to disconnected job

8/24 14:50:49 (11.0) (1452): LastJobLeaseRenewal: 1187956249 Fri Aug 24 14:50:49 2007

8/24 14:50:49 (11.0) (1452): JobLeaseDuration: 1200 seconds

8/24 14:50:49 (11.0) (1452): JobLeaseDuration remaining: 1200

8/24 14:50:49 (11.0) (1452): Attempting to locate disconnected starter

8/24 14:50:49 (11.0) (1452): Found starter: <10.254.254.219:3216>

8/24 14:50:49 (11.0) (1452): Attempting to reconnect to starter <10.254.254.219:3216>

8/24 14:50:50 (11.0) (1452): attempt to connect to <10.254.254.219:3216> failed: connect errno = 10061 connection refused.

8/24 14:50:50 (11.0) (1452): Attempt to reconnect failed: Failed to connect to starter <10.254.254.219:3216>

8/24 14:50:50 (11.0) (1452): JobLeaseDuration remaining: 1199

8/24 14:50:50 (11.0) (1452): Scheduling another attempt to reconnect in 8 seconds

8/24 14:50:58 (11.0) (1452): Attempting to locate disconnected starter

8/24 14:50:58 (11.0) (1452): locateStarter(): ClaimId (<10.254.254.219:1068>#1187941812#17) and GlobalJobId ( comp12#1187956244#11.0 ) not found

8/24 14:50:58 (11.0) (1452): Reconnect FAILED: Job not found at execution machine

8/24 14:50:58 (11.0) (1452): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107

8/24 14:52:23 ******************************************************

8/24 14:52:23 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:52:23 ** C:\condor\bin\condor_shadow.exe

8/24 14:52:23 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:52:23 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:52:23 ** PID = 612

8/24 14:52:23 ** Log last touched 8/24 14:50:58

8/24 14:52:23 ******************************************************

8/24 14:52:23 Using config source: C:\condor\condor_config

8/24 14:52:23 Using local config sources:

8/24 14:52:23    C:\condor/condor_config.local

8/24 14:52:23 DaemonCore: Command Socket at <10.254.254.219:3268>

8/24 14:52:23 Initializing a VANILLA shadow for job 12.0

8/24 14:52:23 (12.0) (612): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:52:25 (12.0) (612): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_3812\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:52:27 ******************************************************

8/24 14:52:27 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:52:27 ** C:\condor\bin\condor_shadow.exe

8/24 14:52:27 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:52:27 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:52:27 ** PID = 2648

8/24 14:52:27 ** Log last touched 8/24 14:52:25

8/24 14:52:27 ******************************************************

8/24 14:52:27 Using config source: C:\condor\condor_config

8/24 14:52:27 Using local config sources:

8/24 14:52:27    C:\condor/condor_config.local

8/24 14:52:27 DaemonCore: Command Socket at <10.254.254.219:3286>

8/24 14:52:27 Initializing a VANILLA shadow for job 12.0

8/24 14:52:27 (12.0) (2648): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:52:29 (12.0) (2648): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_2744\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:52:31 ******************************************************

8/24 14:52:31 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:52:31 ** C:\condor\bin\condor_shadow.exe

8/24 14:52:31 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:52:31 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:52:31 ** PID = 1360

8/24 14:52:31 ** Log last touched 8/24 14:52:29

8/24 14:52:31 ******************************************************

8/24 14:52:31 Using config source: C:\condor\condor_config

8/24 14:52:31 Using local config sources:

8/24 14:52:31    C:\condor/condor_config.local

8/24 14:52:31 DaemonCore: Command Socket at <10.254.254.219:3305>

8/24 14:52:31 Initializing a VANILLA shadow for job 12.0

8/24 14:52:31 (12.0) (1360): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:52:33 (12.0) (1360): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_3264\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:52:35 ******************************************************

8/24 14:52:35 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:52:35 ** C:\condor\bin\condor_shadow.exe

8/24 14:52:35 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:52:35 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:52:35 ** PID = 3804

8/24 14:52:35 ** Log last touched 8/24 14:52:33

8/24 14:52:35 ******************************************************

8/24 14:52:35 Using config source: C:\condor\condor_config

8/24 14:52:35 Using local config sources:

8/24 14:52:35    C:\condor/condor_config.local

8/24 14:52:35 DaemonCore: Command Socket at <10.254.254.219:3323>

8/24 14:52:35 Initializing a VANILLA shadow for job 12.0

8/24 14:52:35 (12.0) (3804): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:52:36 (12.0) (3804): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_528\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

8/24 14:52:39 ******************************************************

8/24 14:52:39 ** condor_shadow (CONDOR_SHADOW) STARTING UP

8/24 14:52:39 ** C:\condor\bin\condor_shadow.exe

8/24 14:52:39 ** $CondorVersion: 6.8.4 Feb  1 2007 $

8/24 14:52:39 ** $CondorPlatform: INTEL-WINNT50 $

8/24 14:52:39 ** PID = 1220

8/24 14:52:39 ** Log last touched 8/24 14:52:36

8/24 14:52:39 ******************************************************

8/24 14:52:39 Using config source: C:\condor\condor_config

8/24 14:52:39 Using local config sources:

8/24 14:52:39    C:\condor/condor_config.local

8/24 14:52:39 DaemonCore: Command Socket at <10.254.254.219:3345>

8/24 14:52:39 Initializing a VANILLA shadow for job 12.0

8/24 14:52:40 (12.0) (1220): Request to run on <10.254.254.219:1068> was ACCEPTED

8/24 14:52:41 (12.0) (1220): ERROR "Error from starter on vm1@comp12: Create_Process(C:\condor\execute\dir_2860\condor_exec.exe,, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C

 

 

I suspected that the problem could be the compiler (in this case Turbo Pascal 7), so I also compiled and linked an executable with Borland C. The result was exactly the same.
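
The "is not a valid Windows executable" message in StarterLog could be cross-checked by inspecting the header of the submitted file directly. The Python sketch below is only an illustration and rests on an unverified assumption: that both compilers may be producing 16-bit DOS or 16-bit Windows binaries instead of 32-bit PE files. The function name classify_executable and its return labels are invented for this example and are not part of Condor.

```python
import struct


def classify_executable(data: bytes) -> str:
    """Return a rough label for the executable format found in `data`.

    Labels (hypothetical, for this sketch only):
      "pe"      32-bit/64-bit Windows Portable Executable
      "ne"      16-bit Windows "New Executable"
      "dos-mz"  plain DOS executable (MZ header only)
      "not-mz"  no MZ header at all
    """
    # Every DOS and Windows executable starts with the two bytes "MZ".
    if len(data) < 2 or data[:2] != b"MZ":
        return "not-mz"
    # The offset of the newer header is stored at 0x3C; a file too short
    # to hold that field can only be a plain DOS program.
    if len(data) < 0x40:
        return "dos-mz"
    (new_header_offset,) = struct.unpack_from("<I", data, 0x3C)
    signature = data[new_header_offset:new_header_offset + 4]
    if signature == b"PE\x00\x00":
        return "pe"
    if signature[:2] == b"NE":
        return "ne"
    return "dos-mz"
```

Reading the job executable in binary mode and passing its bytes to classify_executable would show whether it carries a PE signature. A result other than "pe" would be consistent with the starter's complaint, although this has not been confirmed against the Condor source.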

I would be grateful for any pointers on where to look for a solution. Thanks in advance!

 

Ivailo Penev