
[Condor-users] Job was evicted and never finish!




Hi!

I would really appreciate it if someone could give me some hints about my problem:
when I submit a job to my pool, the job is frequently evicted and restarted on another node (the execute nodes run Windows). I have copied part of the user log and the ShadowLog below, and highlighted the times of the evictions in both logs.

For example, in the user log I have:

011 (646.000.000) 06/23 11:22:23 Job was unsuspended.
...
004 (646.000.000) 06/23 11:22:23 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
...
and at the same time in the ShadowLog I have:

6/23 11:22:23 (646.0) (21691): About to decode condor_sysnum
6/23 11:22:23 (646.0) (21691): Got request for syscall job_exit (-65)
6/23 11:22:23 (646.0) (21691): in pseudo_job_exit: status=-1073741510,reason=107
6/23 11:22:23 (646.0) (21691):  rval = 0, errno = 25
6/23 11:22:23 (646.0) (21691): Shadow: do_REMOTE_syscall returned < 0
6/23 11:22:23 (646.0) (21691): Job 646.0 is being evicted
6/23 11:22:23 (646.0) (21691): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
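
A note on reading the excerpt above: the `status=-1073741510` value in `pseudo_job_exit` looks like a signed 32-bit rendering of a Windows NTSTATUS code. Here is a minimal sketch of the conversion, assuming the value really is a 32-bit exit status reported by the Windows execute node. The helper name `to_ntstatus_hex` is made up for illustration and is not part of Condor.

```python
def to_ntstatus_hex(signed_status: int) -> str:
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF maps a negative value to its two's-complement
    # unsigned 32-bit equivalent.
    return f"0x{signed_status & 0xFFFFFFFF:08X}"


# Value copied from the ShadowLog line "in pseudo_job_exit: status=-1073741510"
print(to_ntstatus_hex(-1073741510))  # prints 0xC000013A
```

Microsoft documents 0xC000013A as STATUS_CONTROL_C_EXIT (the application terminated as a result of a CTRL+C). That would be consistent with the job process being stopped from outside rather than crashing on its own, but the logs here do not confirm how the process was terminated.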



Regards,
Alex


---------------------USER LOG--------------------------------------

.
.
001 (646.000.000) 06/22 12:15:42 Job executing on host: <165.134.71.109:1030>
...
010 (646.000.000) 06/22 13:54:26 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/22 13:56:02 Job was unsuspended.
...
010 (646.000.000) 06/22 16:44:19 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/22 16:45:20 Job was unsuspended.
...
010 (646.000.000) 06/22 18:49:03 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/22 18:49:28 Job was unsuspended.
...
010 (646.000.000) 06/23 01:36:04 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 01:43:20 Job was unsuspended.
...
010 (646.000.000) 06/23 03:06:21 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 03:12:42 Job was unsuspended.
...
010 (646.000.000) 06/23 04:54:57 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 04:55:42 Job was unsuspended.
...
010 (646.000.000) 06/23 11:00:02 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 11:01:08 Job was unsuspended.
...
010 (646.000.000) 06/23 11:01:23 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 11:09:19 Job was unsuspended.
...
010 (646.000.000) 06/23 11:12:20 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 11:22:23 Job was unsuspended.
...
004 (646.000.000) 06/23 11:22:23 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
...
001 (646.000.000) 06/23 11:40:57 Job executing on host: < 165.134.74.172:1030>
...
010 (646.000.000) 06/23 14:37:28 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 14:47:32 Job was unsuspended.
...
004 (646.000.000) 06/23 14:47:33 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
...
001 (646.000.000) 06/23 15:21:06 Job executing on host: < 165.134.71.190:1032>
...
010 (646.000.000) 06/23 17:41:49 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 17:51:53 Job was unsuspended.
...
004 (646.000.000) 06/23 17:51:54 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
...
001 (646.000.000) 06/23 18:01:01 Job executing on host: < 165.134.71.189:1044>
...
010 (646.000.000) 06/23 18:32:30 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 18:33:13 Job was unsuspended.
...
006 (646.000.000) 06/23 18:33:13 Image size of job updated: 1344
...
004 (646.000.000) 06/23 18:33:14 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
...
001 (646.000.000) 06/23 21:41:04 Job executing on host: <165.134.74.172:1030>
...
010 (646.000.000) 06/24 15:45:12 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/24 15:55:16 Job was unsuspended.
...
004 (646.000.000) 06/24 15:55:16 Job was evicted.
       (0) Job was not checkpointed.
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
               Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
       0  -  Run Bytes Sent By Job
       100352  -  Run Bytes Received By Job
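
To make the pattern in the user log above easier to see, here is a rough sketch that measures how long the job was suspended immediately before each eviction. It assumes the event codes mean what the log text says (001 executing, 010 suspended, 011 unsuspended, 004 evicted) and that the year is 2007, as shown in the ShadowLog. The function name and the `SAMPLE` string are made up for illustration; `SAMPLE` holds a few lines copied from the log above.

```python
import re
from datetime import datetime

# Matches event header lines such as:
# "010 (646.000.000) 06/23 14:37:28 Job was suspended."
EVENT_RE = re.compile(r"^(\d{3}) \((\d+\.\d+\.\d+)\) (\d\d/\d\d \d\d:\d\d:\d\d) ")


def suspensions_before_eviction(log_text, year=2007):
    """Return (eviction timestamp, seconds of the last completed suspension)
    for every 004 event. The duration is None if the job was never
    suspended during that run."""
    results = []
    suspended_at = None
    last_suspension = None
    for line in log_text.splitlines():
        m = EVENT_RE.match(line.strip())
        if not m:
            continue  # detail lines and "..." separators are skipped
        code, _job, stamp = m.groups()
        when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
        if code == "001":    # new run starts: forget earlier suspensions
            suspended_at = None
            last_suspension = None
        elif code == "010":  # suspended
            suspended_at = when
        elif code == "011" and suspended_at is not None:  # unsuspended
            last_suspension = (when - suspended_at).total_seconds()
            suspended_at = None
        elif code == "004":  # evicted
            results.append((stamp, last_suspension))
            last_suspension = None
    return results


SAMPLE = """\
010 (646.000.000) 06/23 14:37:28 Job was suspended.
       Number of processes actually suspended: 1
...
011 (646.000.000) 06/23 14:47:32 Job was unsuspended.
...
004 (646.000.000) 06/23 14:47:33 Job was evicted.
...
001 (646.000.000) 06/23 15:21:06 Job executing on host: < 165.134.71.190:1032>
...
010 (646.000.000) 06/23 17:41:49 Job was suspended.
...
011 (646.000.000) 06/23 17:51:53 Job was unsuspended.
...
004 (646.000.000) 06/23 17:51:54 Job was evicted.
"""

for stamp, seconds in suspensions_before_eviction(SAMPLE):
    print(stamp, seconds)
```

In the full user log above, four of the five evictions (06/23 11:22:23, 14:47:33, 17:51:54 and 06/24 15:55:16) come right after a suspension of about ten minutes (603 to 604 seconds), while the 18:33:14 eviction follows a 43-second suspension. This may point to a suspension time limit in the execute nodes' policy, though the logs alone do not confirm it.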






----------------------------ShadowLog-----------------------------------------



6/22 06:15:18 (671.0) (8484): About to decode condor_sysnum
6/22 06:15:18 (671.0) (8484): condor_read(): recv() returned -1, errno = 104, assuming failure.
6/22 06:15:18 (671.0) (8484): Can no longer talk to condor_starter <165.134.71.141:1035 >
6/22 06:15:18 (671.0) (8484): Trying to reconnect to disconnected job
6/22 06:15:18 (671.0) (8484): LastJobLeaseRenewal: 1182475526 Fri Jun 22 05:55:26 2007
6/22 06:15:18 (671.0) (8484): JobLeaseDuration: 14600 seconds
6/22 06:15:18 (671.0) (8484): JobLeaseDuration remaining: 13408
6/22 06:15:18 (671.0) (8484): Attempting to reconnect to starter <165.134.71.141:2577>
6/22 06:15:18 ( 671.0) (8484): Reconnect SUCCESS: connection re-established
6/22 06:15:18 (671.0) (8484):   StarterIpAddr = <165.134.71.141:2577>
6/22 06:15:18 (671.0) (8484):   UidDomain = 50-1
6/22 06:15:18 (671.0) (8484):   FileSystemDomain = 50-1
6/22 06:15:18 (671.0) (8484):   Machine = 50-1
6/22 06:15:18 (671.0) (8484):   Arch = INTEL
6/22 06:15:18 (671.0) (8484):   OpSys = WINNT51
6/22 06:15:18 ( 671.0) (8484):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/22 06:15:18 (671.0) (8484):   HasReconnect = TRUE
6/22 12:11:31 (678.0) (26675): About to decode condor_sysnum
6/22 12:11:31 (678.0) (26675): condor_read(): recv() returned -1, errno = 104, assuming failure.
6/22 12:11:31 (678.0) (26675): Can no longer talk to condor_starter <165.134.71.106:1036>
6/22 12:11:31 (678.0) (26675): Trying to reconnect to disconnected job
6/22 12:11:31 ( 678.0) (26675): LastJobLeaseRenewal: 1182493305 Fri Jun 22 10:51:45 2007
6/22 12:11:31 (678.0) (26675): JobLeaseDuration: 1200 seconds
6/22 12:11:31 (678.0) (26675): JobLeaseDuration remaining: EXPIRED!
6/22 12:11:31 ( 678.0) (26675): Reconnect FAILED: Job disconnected too long: JobLeaseDuration (1200 seconds) expired
6/22 12:11:31 (678.0) (26675): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/22 12:15:00 (646.0) (2168): About to decode condor_sysnum
6/22 12:15:00 (646.0) (2168): condor_read(): recv() returned -1, errno = 104, assuming failure.
6/22 12:15:00 (646.0) (2168): Can no longer talk to condor_starter <165.134.71.106:1036 >
6/22 12:15:00 (646.0) (2168): Trying to reconnect to disconnected job
6/22 12:15:00 (646.0) (2168): LastJobLeaseRenewal: 1182492310 Fri Jun 22 10:35:10 2007
6/22 12:15:00 (646.0) (2168): JobLeaseDuration: 1200 seconds
6/22 12:15:00 (646.0) (2168): JobLeaseDuration remaining: EXPIRED!
6/22 12:15:00 (646.0) (2168): Reconnect FAILED: Job disconnected too long: JobLeaseDuration (1200 seconds) expired
6/22 12:15:00 (646.0) (2168): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/22 12:15:40 ******************************************************
6/22 12:15:40 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/22 12:15:40 ** /usr/local/condor/sbin/condor_shadow
6/22 12:15:40 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/22 12:15:40 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/22 12:15:40 ** PID = 21691
6/22 12:15:40 ** Log last touched 6/22 12:15:00
6/22 12:15:40 ******************************************************
6/22 12:15:40 Using config source: /home/condor/condor_config
6/22 12:15:40 Using local config sources:
6/22 12:15:40    /home/condor/condor_config.local
6/22 12:15:40 DaemonCore: Command Socket at < 165.134.74.162:35611>
6/22 12:15:40 Initializing a VANILLA shadow for job 646.0
6/22 12:15:40 (646.0) (21691): Request to run on <165.134.71.109:1030> was ACCEPTED
6/22 12:15:41 (646.0) (21691): About to decode condor_sysnum
6/22 12:15:41 (646.0) (21691): Got request for syscall get_job_info (-63)
6/22 12:15:41 (646.0) (21691):  rval = 0, errno = 0
6/22 12:15:41 (646.0) (21691): About to decode condor_sysnum
6/22 12:15:41 (646.0) (21691): Got request for syscall register_starter_info (-77)
6/22 12:15:41 (646.0) (21691):   StarterIpAddr = <165.134.71.109:3865>
6/22 12:15:41 ( 646.0) (21691):   UidDomain = Moshaii
6/22 12:15:41 (646.0) (21691):   FileSystemDomain = Moshaii
6/22 12:15:41 (646.0) (21691):   Machine = Moshaii
6/22 12:15:41 (646.0) (21691):   Arch = INTEL
6/22 12:15:41 ( 646.0) (21691):   OpSys = WINNT50
6/22 12:15:41 (646.0) (21691):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/22 12:15:41 (646.0) (21691):   HasReconnect = TRUE
6/22 12:15:41 (646.0) (21691):  rval = 0, errno = 0
6/22 12:15:42 ******************************************************
6/22 12:15:42 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/22 12:15:42 ** /usr/local/condor/sbin/condor_shadow
6/22 12:15:42 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/22 12:15:42 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/22 12:15:42 ** PID = 21692
6/22 12:15:42 ** Log last touched 6/22 12:15:41
6/22 12:15:42 ******************************************************
6/22 12:15:42 Using config source: /home/condor/condor_config
6/22 12:15:42 Using local config sources:
6/22 12:15:42    /home/condor/condor_config.local
6/22 12:15:42 DaemonCore: Command Socket at < 165.134.74.162:35614>
6/22 12:15:42 Initializing a VANILLA shadow for job 678.0
6/22 12:15:42 (678.0) (21692): Request to run on <165.134.74.30:1034> was ACCEPTED
6/22 12:15:42 (678.0) (21692): About to decode condor_sysnum
6/22 12:15:42 (678.0) (21692): Got request for syscall get_job_info (-63)
6/22 12:15:42 (678.0) (21692):  rval = 0, errno = 0
6/22 12:15:42 (678.0) (21692): About to decode condor_sysnum
6/22 12:15:42 (678.0) (21692): Got request for syscall register_starter_info (-77)
6/22 12:15:42 (678.0) (21692):   StarterIpAddr = <165.134.74.30:3529>
6/22 12:15:42 ( 678.0) (21692):   UidDomain = ipm-7399c481875
6/22 12:15:42 (678.0) (21692):   FileSystemDomain = ipm-7399c481875
6/22 12:15:42 (678.0) (21692):   Machine = vm2@ipm-7399c481875
6/22 12:15:42 (678.0) (21692):   Arch = INTEL
6/22 12:15:42 (678.0) (21692):   OpSys = WINNT51
6/22 12:15:42 (678.0) (21692):   CondorVersion = $CondorVersion: 6.8.4 Feb  1 2007 $
6/22 12:15:42 (678.0) (21692):   HasReconnect = TRUE
6/22 12:15:42 (678.0) (21692):  rval = 0, errno = 0
6/22 12:15:42 (646.0) (21691): About to decode condor_sysnum
6/22 12:15:42 (646.0) (21691): Got request for syscall begin_execution (-78)
6/22 12:15:42 (646.0) (21691):  rval = 0, errno = 0
6/22 12:15:43 (678.0 ) (21692): About to decode condor_sysnum
6/22 12:15:43 (678.0) (21692): Got request for syscall begin_execution (-78)
6/22 12:15:43 (678.0) (21692):  rval = 0, errno = 0
6/22 18:15:18 (671.0) (8484): About to decode condor_sysnum
6/22 18:15:18 (671.0) (8484): condor_read(): recv() returned -1, errno = 104, assuming failure.
6/22 18:15:18 (671.0) (8484): Can no longer talk to condor_starter <165.134.71.141:2577 >
6/22 18:15:18 (671.0) (8484): JobLeaseDuration remaining: 13407
6/22 18:15:18 (671.0) (8484): Attempting to reconnect to starter <165.134.71.141:2577>
6/22 18:15:18 ( 671.0) (8484): Reconnect SUCCESS: connection re-established
6/22 18:15:18 (671.0) (8484):   StarterIpAddr = <165.134.71.141:2577>
6/22 18:15:18 (671.0) (8484):   UidDomain = 50-1
6/22 18:15:18 (671.0) (8484):   FileSystemDomain = 50-1
6/22 18:15:18 (671.0) (8484):   Machine = 50-1
6/22 18:15:18 (671.0) (8484):   Arch = INTEL
6/22 18:15:18 (671.0) (8484):   OpSys = WINNT51
6/22 18:15:18 ( 671.0) (8484):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/22 18:15:18 (671.0) (8484):   HasReconnect = TRUE
6/23 11:22:23 (646.0) (21691): About to decode condor_sysnum
6/23 11:22:23 (646.0) (21691): Got request for syscall job_exit (-65)
6/23 11:22:23 (646.0) (21691): in pseudo_job_exit: status=-1073741510,reason=107
6/23 11:22:23 (646.0) (21691):  rval = 0, errno = 25
6/23 11:22:23 (646.0) (21691): Shadow: do_REMOTE_syscall returned < 0
6/23 11:22:23 (646.0) (21691): Job 646.0 is being evicted
6/23 11:22:23 (646.0) (21691): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 11:40:55 ******************************************************
6/23 11:40:55 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 11:40:55 ** /usr/local/condor/sbin/condor_shadow
6/23 11:40:55 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 11:40:55 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 11:40:55 ** PID = 25429
6/23 11:40:55 ** Log last touched 6/23 11:22:23
6/23 11:40:55 ******************************************************
6/23 11:40:55 Using config source: /home/condor/condor_config
6/23 11:40:55 Using local config sources:
6/23 11:40:55    /home/condor/condor_config.local
6/23 11:40:55 DaemonCore: Command Socket at <165.134.74.162:39356>
6/23 11:40:55 Initializing a VANILLA shadow for job 646.0
6/23 11:40:55 (646.0) (25429): Request to run on <165.134.74.172:1030> was ACCEPTED
6/23 11:40:55 (646.0) (25429): About to decode condor_sysnum
6/23 11:40:55 ( 646.0) (25429): Got request for syscall get_job_info (-63)
6/23 11:40:55 (646.0) (25429):  rval = 0, errno = 0
6/23 11:40:55 (646.0) (25429): About to decode condor_sysnum
6/23 11:40:55 (646.0) (25429): Got request for syscall register_starter_info (-77)
6/23 11:40:55 (646.0) (25429):   StarterIpAddr = <165.134.74.172:4647>
6/23 11:40:55 (646.0) (25429):   UidDomain = arabgol
6/23 11:40:55 (646.0) (25429):   FileSystemDomain = arabgol
6/23 11:40:55 (646.0) (25429):   Machine = arabgol
6/23 11:40:55 (646.0) (25429):   Arch = INTEL
6/23 11:40:55 (646.0) (25429):   OpSys = WINNT51
6/23 11:40:55 (646.0) (25429):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 11:40:55 (646.0) (25429):   HasReconnect = TRUE
6/23 11:40:55 (646.0) (25429):  rval = 0, errno = 0
6/23 11:40:57 (646.0) (25429): About to decode condor_sysnum
6/23 11:40:57 (646.0) (25429): Got request for syscall begin_execution (-78)
6/23 11:40:57 (646.0) (25429):  rval = 0, errno = 0
6/23 14:47:33 (646.0) (25429): About to decode condor_sysnum
6/23 14:47:33 (646.0) (25429): Got request for syscall job_exit (-65)
6/23 14:47:33 (646.0) (25429): in pseudo_job_exit: status=-1073741510,reason=107
6/23 14:47:33 (646.0) (25429):  rval = 0, errno = 25
6/23 14:47:33 (646.0) (25429): Shadow: do_REMOTE_syscall returned < 0
6/23 14:47:33 (646.0) (25429): Job 646.0 is being evicted
6/23 14:47:33 ( 646.0) (25429): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 14:55:56 ******************************************************
6/23 14:55:56 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 14:55:56 ** /usr/local/condor/sbin/condor_shadow
6/23 14:55:56 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 14:55:56 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 14:55:56 ** PID = 26027
6/23 14:55:56 ** Log last touched 6/23 14:47:33
6/23 14:55:56 ******************************************************
6/23 14:55:56 Using config source: /home/condor/condor_config
6/23 14:55:56 Using local config sources:
6/23 14:55:56    /home/condor/condor_config.local
6/23 14:55:56 DaemonCore: Command Socket at < 165.134.74.162:39894>
6/23 14:55:56 Initializing a VANILLA shadow for job 646.0
6/23 14:55:56 (646.0) (26027): Request to run on <165.134.71.184:1045> was REFUSED
6/23 14:55:56 (646.0) (26027): Job 646.0 is being evicted
6/23 14:55:56 (646.0) (26027): logEvictEvent with unknown reason (108), aborting
6/23 14:55:56 (646.0) (26027): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
6/23 15:05:57 ******************************************************
6/23 15:05:57 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 15:05:57 ** /usr/local/condor/sbin/condor_shadow
6/23 15:05:57 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 15:05:57 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 15:05:57 ** PID = 26119
6/23 15:05:57 ** Log last touched 6/23 14:55:56
6/23 15:05:57 ******************************************************
6/23 15:05:57 Using config source: /home/condor/condor_config
6/23 15:05:57 Using local config sources:
6/23 15:05:57    /home/condor/condor_config.local
6/23 15:05:57 DaemonCore: Command Socket at < 165.134.74.162:39930>
6/23 15:05:57 Initializing a VANILLA shadow for job 646.0
6/23 15:05:57 (646.0) (26119): Request to run on <165.134.75.23:1038> was REFUSED
6/23 15:05:57 (646.0) (26119): Job 646.0 is being evicted
6/23 15:05:57 (646.0) (26119): logEvictEvent with unknown reason (108), aborting
6/23 15:05:57 (646.0) (26119): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
6/23 15:17:50 (671.0) (8484): About to decode condor_sysnum
6/23 15:17:50 (671.0) (8484): Got request for syscall job_exit (-65)
6/23 15:17:50 (671.0) (8484): in pseudo_job_exit: status=-1073741510,reason=107
6/23 15:17:50 (671.0) (8484):   rval = 0, errno = 25
6/23 15:17:50 (671.0) (8484): Shadow: do_REMOTE_syscall returned < 0
6/23 15:17:50 (671.0) (8484): Job 671.0 is being evicted
6/23 15:17:50 (671.0) (8484): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 15:20:57 ******************************************************
6/23 15:20:57 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 15:20:57 ** /usr/local/condor/sbin/condor_shadow
6/23 15:20:57 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 15:20:57 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 15:20:57 ** PID = 26176
6/23 15:20:57 ** Log last touched 6/23 15:17:50
6/23 15:20:57 ******************************************************
6/23 15:20:57 Using config source: /home/condor/condor_config
6/23 15:20:57 Using local config sources:
6/23 15:20:57    /home/condor/condor_config.local
6/23 15:20:57 DaemonCore: Command Socket at < 165.134.74.162:39981>
6/23 15:20:57 Initializing a VANILLA shadow for job 646.0
6/23 15:20:57 (646.0) (26176): Request to run on <165.134.71.190:1032> was ACCEPTED
6/23 15:21:02 (646.0) (26176): About to decode condor_sysnum
6/23 15:21:02 (646.0) (26176): Got request for syscall get_job_info (-63)
6/23 15:21:02 (646.0) (26176):  rval = 0, errno = 0
6/23 15:21:03 (646.0) (26176): About to decode condor_sysnum
6/23 15:21:03 (646.0) (26176): Got request for syscall register_starter_info (-77)
6/23 15:21:03 (646.0) (26176):   StarterIpAddr = <165.134.71.190:1139>
6/23 15:21:03 ( 646.0) (26176):   UidDomain = Saloon2
6/23 15:21:03 (646.0) (26176):   FileSystemDomain = Saloon2
6/23 15:21:03 (646.0) (26176):   Machine = Saloon2
6/23 15:21:03 (646.0) (26176):   Arch = INTEL
6/23 15:21:03 ( 646.0) (26176):   OpSys = WINNT50
6/23 15:21:03 (646.0) (26176):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 15:21:03 (646.0) (26176):   HasReconnect = TRUE
6/23 15:21:03 (646.0) (26176):  rval = 0, errno = 0
6/23 15:21:06 (646.0) (26176): About to decode condor_sysnum
6/23 15:21:06 (646.0) (26176): Got request for syscall begin_execution (-78)
6/23 15:21:06 (646.0) (26176):  rval = 0, errno = 0
6/23 15:45:57 ******************************************************
6/23 15:45:57 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 15:45:57 ** /usr/local/condor/sbin/condor_shadow
6/23 15:45:57 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 15:45:57 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 15:45:57 ** PID = 26261
6/23 15:45:57 ** Log last touched 6/23 15:21:06
6/23 15:45:57 ******************************************************
6/23 15:45:57 Using config source: /home/condor/condor_config
6/23 15:45:57 Using local config sources:
6/23 15:45:57    /home/condor/condor_config.local
6/23 15:45:57 DaemonCore: Command Socket at <165.134.74.162:40055>
6/23 15:45:57 Initializing a VANILLA shadow for job 671.0
6/23 15:45:57 (671.0) (26261): Request to run on <165.134.71.189:1044> was REFUSED
6/23 15:45:57 (671.0) (26261): Job 671.0 is being evicted
6/23 15:45:57 (671.0 ) (26261): logEvictEvent with unknown reason (108), aborting
6/23 15:45:57 (671.0) (26261): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
6/23 16:15:58 ******************************************************
6/23 16:15:58 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 16:15:58 ** /usr/local/condor/sbin/condor_shadow
6/23 16:15:58 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 16:15:58 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 16:15:58 ** PID = 26367
6/23 16:15:58 ** Log last touched 6/23 15:45:57
6/23 16:15:58 ******************************************************
6/23 16:15:58 Using config source: /home/condor/condor_config
6/23 16:15:58 Using local config sources:
6/23 16:15:58    /home/condor/condor_config.local
6/23 16:15:58 DaemonCore: Command Socket at <165.134.74.162:40145>
6/23 16:15:58 Initializing a VANILLA shadow for job 671.0
6/23 16:15:58 (671.0) (26367): Request to run on <165.134.71.184:1045> was REFUSED
6/23 16:15:58 (671.0) (26367): Job 671.0 is being evicted
6/23 16:15:58 (671.0 ) (26367): logEvictEvent with unknown reason (108), aborting
6/23 16:15:58 (671.0) (26367): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
6/23 16:20:58 ******************************************************
6/23 16:20:58 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 16:20:58 ** /usr/local/condor/sbin/condor_shadow
6/23 16:20:58 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 16:20:58 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 16:20:58 ** PID = 26387
6/23 16:20:58 ** Log last touched 6/23 16:15:58
6/23 16:20:58 ******************************************************
6/23 16:20:58 Using config source: /home/condor/condor_config
6/23 16:20:58 Using local config sources:
6/23 16:20:58    /home/condor/condor_config.local
6/23 16:20:58 DaemonCore: Command Socket at <165.134.74.162:40164>
6/23 16:20:58 Initializing a VANILLA shadow for job 671.0
6/23 16:20:58 (671.0) (26387): Request to run on <165.134.71.189:1044> was ACCEPTED
6/23 16:20:59 (671.0) (26387): About to decode condor_sysnum
6/23 16:20:59 ( 671.0) (26387): Got request for syscall get_job_info (-63)
6/23 16:20:59 (671.0) (26387):  rval = 0, errno = 0
6/23 16:20:59 (671.0) (26387): About to decode condor_sysnum
6/23 16:20:59 (671.0) (26387): Got request for syscall register_starter_info (-77)
6/23 16:20:59 (671.0) (26387):   StarterIpAddr = <165.134.71.189:2779>
6/23 16:20:59 (671.0) (26387):   UidDomain = ayazi
6/23 16:20:59 (671.0) (26387):   FileSystemDomain = ayazi
6/23 16:20:59 (671.0) (26387):   Machine = vm1@ayazi
6/23 16:20:59 (671.0) (26387):   Arch = INTEL
6/23 16:20:59 (671.0) (26387):   OpSys = WINNT51
6/23 16:20:59 (671.0) (26387):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 16:20:59 (671.0) (26387):   HasReconnect = TRUE
6/23 16:20:59 (671.0) (26387):  rval = 0, errno = 0
6/23 16:21:01 (671.0) (26387): About to decode condor_sysnum
6/23 16:21:01 (671.0) (26387): Got request for syscall begin_execution (-78)
6/23 16:21:01 (671.0) (26387):  rval = 0, errno = 0
6/23 16:31:46 (671.0) (26387): About to decode condor_sysnum
6/23 16:31:46 (671.0) (26387): Got request for syscall job_exit (-65)
6/23 16:31:46 (671.0) (26387): in pseudo_job_exit: status=-1073741510,reason=107
6/23 16:31:46 (671.0) (26387):  rval = 0, errno = 25
6/23 16:31:46 (671.0) (26387): Shadow: do_REMOTE_syscall returned < 0
6/23 16:31:46 (671.0) (26387): Job 671.0 is being evicted
6/23 16:31:46 (671.0) (26387): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 17:50:59 ******************************************************
6/23 17:50:59 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 17:50:59 ** /usr/local/condor/sbin/condor_shadow
6/23 17:50:59 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 17:50:59 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 17:50:59 ** PID = 26678
6/23 17:50:59 ** Log last touched 6/23 16:31:46
6/23 17:50:59 ******************************************************
6/23 17:50:59 Using config source: /home/condor/condor_config
6/23 17:50:59 Using local config sources:
6/23 17:50:59    /home/condor/condor_config.local
6/23 17:50:59 DaemonCore: Command Socket at < 165.134.74.162:40418>
6/23 17:50:59 Initializing a VANILLA shadow for job 671.0
6/23 17:50:59 (671.0) (26678): Request to run on <165.134.71.189:1044> was ACCEPTED
6/23 17:51:00 (671.0) (26678): About to decode condor_sysnum
6/23 17:51:00 (671.0) (26678): Got request for syscall get_job_info (-63)
6/23 17:51:00 (671.0) (26678):  rval = 0, errno = 0
6/23 17:51:00 (671.0) (26678): About to decode condor_sysnum
6/23 17:51:00 (671.0) (26678): Got request for syscall register_starter_info (-77)
6/23 17:51:00 (671.0) (26678):   StarterIpAddr = <165.134.71.189:3205>
6/23 17:51:00 ( 671.0) (26678):   UidDomain = ayazi
6/23 17:51:00 (671.0) (26678):   FileSystemDomain = ayazi
6/23 17:51:00 (671.0) (26678):   Machine = vm1@ayazi
6/23 17:51:00 (671.0) (26678):   Arch = INTEL
6/23 17:51:00 (671.0 ) (26678):   OpSys = WINNT51
6/23 17:51:00 (671.0) (26678):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 17:51:00 (671.0) (26678):   HasReconnect = TRUE
6/23 17:51:00 (671.0) (26678):  rval = 0, errno = 0
6/23 17:51:02 (671.0) (26678): About to decode condor_sysnum
6/23 17:51:02 (671.0) (26678): Got request for syscall begin_execution (-78)
6/23 17:51:02 (671.0) (26678):  rval = 0, errno = 0
6/23 17:51:54 (646.0) (26176): About to decode condor_sysnum
6/23 17:51:54 (646.0) (26176): Got request for syscall job_exit (-65)
6/23 17:51:54 (646.0) (26176): in pseudo_job_exit: status=-1073741510,reason=107
6/23 17:51:54 (646.0) (26176):  rval = 0, errno = 25
6/23 17:51:54 ( 646.0) (26176): Shadow: do_REMOTE_syscall returned < 0
6/23 17:51:54 (646.0) (26176): Job 646.0 is being evicted
6/23 17:51:54 (646.0) (26176): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 18:00:59 ******************************************************
6/23 18:00:59 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 18:00:59 ** /usr/local/condor/sbin/condor_shadow
6/23 18:00:59 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 18:00:59 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 18:00:59 ** PID = 26707
6/23 18:00:59 ** Log last touched 6/23 17:51:54
6/23 18:00:59 ******************************************************
6/23 18:00:59 Using config source: /home/condor/condor_config
6/23 18:00:59 Using local config sources:
6/23 18:00:59    /home/condor/condor_config.local
6/23 18:00:59 DaemonCore: Command Socket at <165.134.74.162:40452>
6/23 18:00:59 Initializing a VANILLA shadow for job 646.0
6/23 18:00:59 (646.0) (26707): Request to run on <165.134.71.189:1044> was ACCEPTED
6/23 18:00:59 (646.0) (26707): About to decode condor_sysnum
6/23 18:00:59 ( 646.0) (26707): Got request for syscall get_job_info (-63)
6/23 18:00:59 (646.0) (26707):  rval = 0, errno = 0
6/23 18:00:59 (646.0) (26707): About to decode condor_sysnum
6/23 18:00:59 (646.0) (26707): Got request for syscall register_starter_info (-77)
6/23 18:00:59 (646.0) (26707):   StarterIpAddr = <165.134.71.189:3238>
6/23 18:00:59 (646.0) (26707):   UidDomain = ayazi
6/23 18:00:59 (646.0) (26707):   FileSystemDomain = ayazi
6/23 18:00:59 (646.0) (26707):   Machine = vm2@ayazi
6/23 18:00:59 (646.0) (26707):   Arch = INTEL
6/23 18:00:59 (646.0) (26707):   OpSys = WINNT51
6/23 18:00:59 (646.0) (26707):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 18:00:59 (646.0) (26707):   HasReconnect = TRUE
6/23 18:00:59 (646.0) (26707):  rval = 0, errno = 0
6/23 18:01:01 (646.0) (26707): About to decode condor_sysnum
6/23 18:01:01 (646.0) (26707): Got request for syscall begin_execution (-78)
6/23 18:01:01 (646.0) (26707):  rval = 0, errno = 0
6/23 18:33:14 (646.0) (26707): About to decode condor_sysnum
6/23 18:33:14 (646.0) (26707): Got request for syscall job_exit (-65)
6/23 18:33:14 (646.0) (26707): in pseudo_job_exit: status=-1073741510,reason=107
6/23 18:33:14 (646.0) (26707):  rval = 0, errno = 25
6/23 18:33:14 (646.0) (26707): Shadow: do_REMOTE_syscall returned < 0
6/23 18:33:14 (646.0) (26707): Job 646.0 is being evicted
6/23 18:33:14 ( 646.0) (26707): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 18:33:14 (671.0) (26678): About to decode condor_sysnum
6/23 18:33:14 (671.0) (26678): Got request for syscall job_exit (-65)
6/23 18:33:14 (671.0) (26678): in pseudo_job_exit: status=-1073741510,reason=107
6/23 18:33:14 (671.0) (26678):  rval = 0, errno = 25
6/23 18:33:14 (671.0) (26678): Shadow: do_REMOTE_syscall returned < 0
6/23 18:33:14 ( 671.0) (26678): Job 671.0 is being evicted
6/23 18:33:14 (671.0) (26678): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/23 19:41:00 ******************************************************
6/23 19:41:00 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 19:41:00 ** /usr/local/condor/sbin/condor_shadow
6/23 19:41:00 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 19:41:00 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 19:41:00 ** PID = 26977
6/23 19:41:00 ** Log last touched 6/23 18:33:14
6/23 19:41:00 ******************************************************
6/23 19:41:00 Using config source: /home/condor/condor_config
6/23 19:41:00 Using local config sources:
6/23 19:41:00    /home/condor/condor_config.local
6/23 19:41:00 DaemonCore: Command Socket at <165.134.74.162:40743>
6/23 19:41:00 Initializing a VANILLA shadow for job 671.0
6/23 19:41:00 (671.0) (26977): Request to run on < 165.134.71.141:1035> was ACCEPTED
6/23 19:41:01 (671.0) (26977): About to decode condor_sysnum
6/23 19:41:01 (671.0) (26977): Got request for syscall get_job_info (-63)
6/23 19:41:01 (671.0) (26977):  rval = 0, errno = 0
6/23 19:41:01 (671.0) (26977): About to decode condor_sysnum
6/23 19:41:01 (671.0) (26977): Got request for syscall register_starter_info (-77)
6/23 19:41:01 (671.0 ) (26977):   StarterIpAddr = <165.134.71.141:1843>
6/23 19:41:01 (671.0) (26977):   UidDomain = 50-1
6/23 19:41:01 (671.0) (26977):   FileSystemDomain = 50-1
6/23 19:41:01 ( 671.0) (26977):   Machine = 50-1
6/23 19:41:01 (671.0) (26977):   Arch = INTEL
6/23 19:41:01 (671.0) (26977):   OpSys = WINNT51
6/23 19:41:01 (671.0) (26977):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 19:41:01 (671.0) (26977):   HasReconnect = TRUE
6/23 19:41:01 (671.0) (26977):  rval = 0, errno = 0
6/23 19:41:03 (671.0) (26977): About to decode condor_sysnum
6/23 19:41:03 (671.0) (26977): Got request for syscall begin_execution (-78)
6/23 19:41:03 (671.0) (26977):  rval = 0, errno = 0
6/23 21:41:02 ******************************************************
6/23 21:41:02 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/23 21:41:02 ** /usr/local/condor/sbin/condor_shadow
6/23 21:41:02 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 21:41:02 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/23 21:41:02 ** PID = 27325
6/23 21:41:02 ** Log last touched 6/23 19:41:03
6/23 21:41:02 ******************************************************
6/23 21:41:02 Using config source: /home/condor/condor_config
6/23 21:41:02 Using local config sources:
6/23 21:41:02    /home/condor/condor_config.local
6/23 21:41:02 DaemonCore: Command Socket at < 165.134.74.162:41085>
6/23 21:41:02 Initializing a VANILLA shadow for job 646.0
6/23 21:41:02 (646.0) (27325): Request to run on <165.134.74.172:1030> was ACCEPTED
6/23 21:41:02 (646.0) (27325): About to decode condor_sysnum
6/23 21:41:02 (646.0) (27325): Got request for syscall get_job_info (-63)
6/23 21:41:02 (646.0) (27325):  rval = 0, errno = 0
6/23 21:41:02 (646.0) (27325): About to decode condor_sysnum
6/23 21:41:02 (646.0) (27325): Got request for syscall register_starter_info (-77)
6/23 21:41:02 (646.0) (27325):   StarterIpAddr = <165.134.74.172:2778>
6/23 21:41:02 (646.0) (27325):   UidDomain = arabgol
6/23 21:41:02 (646.0) (27325):   FileSystemDomain = arabgol
6/23 21:41:02 (646.0) (27325):   Machine = arabgol
6/23 21:41:02 (646.0) (27325):   Arch = INTEL
6/23 21:41:02 (646.0) (27325):   OpSys = WINNT51
6/23 21:41:02 (646.0) (27325):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/23 21:41:02 (646.0) (27325):   HasReconnect = TRUE
6/23 21:41:02 (646.0) (27325):  rval = 0, errno = 0
6/23 21:41:04 (646.0) (27325): About to decode condor_sysnum
6/23 21:41:04 (646.0) (27325): Got request for syscall begin_execution (-78)
6/23 21:41:04 (646.0) (27325):  rval = 0, errno = 0
6/23 21:42:02 (671.0) (26977): About to decode condor_sysnum
6/23 21:42:02 (671.0) (26977): Got request for syscall job_exit (-65)
6/23 21:42:02 (671.0) (26977): in pseudo_job_exit: status=-1073741510,reason=107
6/23 21:42:02 (671.0) (26977):  rval = 0, errno = 25
6/23 21:42:02 (671.0) (26977): Shadow: do_REMOTE_syscall returned < 0
6/23 21:42:02 (671.0) (26977): Job 671.0 is being evicted
6/23 21:42:02 (671.0) (26977): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 00:46:04 ******************************************************
6/24 00:46:04 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 00:46:04 ** /usr/local/condor/sbin/condor_shadow
6/24 00:46:04 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 00:46:04 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 00:46:04 ** PID = 27770
6/24 00:46:04 ** Log last touched 6/23 21:42:02
6/24 00:46:04 ******************************************************
6/24 00:46:04 Using config source: /home/condor/condor_config
6/24 00:46:04 Using local config sources:
6/24 00:46:04    /home/condor/condor_config.local
6/24 00:46:04 DaemonCore: Command Socket at <165.134.74.162:41613>
6/24 00:46:04 Initializing a VANILLA shadow for job 671.0
6/24 00:46:04 (671.0) (27770): Request to run on <165.134.71.141:1035> was ACCEPTED
6/24 00:46:04 (671.0) (27770): About to decode condor_sysnum
6/24 00:46:04 (671.0) (27770): Got request for syscall get_job_info (-63)
6/24 00:46:04 (671.0) (27770):  rval = 0, errno = 0
6/24 00:46:04 (671.0) (27770): About to decode condor_sysnum
6/24 00:46:04 (671.0) (27770): Got request for syscall register_starter_info (-77)
6/24 00:46:04 (671.0) (27770):   StarterIpAddr = <165.134.71.141:3193>
6/24 00:46:04 (671.0) (27770):   UidDomain = 50-1
6/24 00:46:04 (671.0) (27770):   FileSystemDomain = 50-1
6/24 00:46:04 (671.0) (27770):   Machine = 50-1
6/24 00:46:04 (671.0) (27770):   Arch = INTEL
6/24 00:46:04 (671.0) (27770):   OpSys = WINNT51
6/24 00:46:04 (671.0) (27770):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 00:46:04 (671.0) (27770):   HasReconnect = TRUE
6/24 00:46:04 (671.0) (27770):  rval = 0, errno = 0
6/24 00:46:06 (671.0) (27770): About to decode condor_sysnum
6/24 00:46:06 (671.0) (27770): Got request for syscall begin_execution (-78)
6/24 00:46:06 (671.0) (27770):  rval = 0, errno = 0
6/24 10:43:57 (671.0) (27770): About to decode condor_sysnum
6/24 10:43:57 (671.0) (27770): Got request for syscall job_exit (-65)
6/24 10:43:57 (671.0) (27770): in pseudo_job_exit: status=-1073741510,reason=107
6/24 10:43:57 (671.0) (27770):  rval = 0, errno = 25
6/24 10:43:57 (671.0) (27770): Shadow: do_REMOTE_syscall returned < 0
6/24 10:43:57 (671.0) (27770): Job 671.0 is being evicted
6/24 10:43:57 (671.0) (27770): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 12:16:10 ******************************************************
6/24 12:16:10 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 12:16:10 ** /usr/local/condor/sbin/condor_shadow
6/24 12:16:10 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 12:16:10 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 12:16:10 ** PID = 29834
6/24 12:16:10 ** Log last touched 6/24 10:43:57
6/24 12:16:10 ******************************************************
6/24 12:16:10 Using config source: /home/condor/condor_config
6/24 12:16:10 Using local config sources:
6/24 12:16:10    /home/condor/condor_config.local
6/24 12:16:10 DaemonCore: Command Socket at <165.134.74.162:43468>
6/24 12:16:10 Initializing a VANILLA shadow for job 671.0
6/24 12:16:11 (671.0) (29834): Request to run on <165.134.71.141:1035> was ACCEPTED
6/24 12:16:12 (671.0) (29834): About to decode condor_sysnum
6/24 12:16:12 (671.0) (29834): Got request for syscall get_job_info (-63)
6/24 12:16:12 (671.0) (29834):  rval = 0, errno = 0
6/24 12:16:12 (671.0) (29834): About to decode condor_sysnum
6/24 12:16:12 (671.0) (29834): Got request for syscall register_starter_info (-77)
6/24 12:16:12 (671.0) (29834):   StarterIpAddr = <165.134.71.141:1045>
6/24 12:16:12 (671.0) (29834):   UidDomain = 50-1
6/24 12:16:12 (671.0) (29834):   FileSystemDomain = 50-1
6/24 12:16:12 (671.0) (29834):   Machine = 50-1
6/24 12:16:12 (671.0) (29834):   Arch = INTEL
6/24 12:16:12 (671.0) (29834):   OpSys = WINNT51
6/24 12:16:12 (671.0) (29834):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 12:16:12 (671.0) (29834):   HasReconnect = TRUE
6/24 12:16:12 (671.0) (29834):  rval = 0, errno = 0
6/24 12:16:14 (671.0) (29834): About to decode condor_sysnum
6/24 12:16:14 (671.0) (29834): Got request for syscall begin_execution (-78)
6/24 12:16:14 (671.0) (29834):  rval = 0, errno = 0
6/24 12:47:27 (671.0) (29834): About to decode condor_sysnum
6/24 12:47:27 (671.0) (29834): Got request for syscall job_exit (-65)
6/24 12:47:27 (671.0) (29834): in pseudo_job_exit: status=-1073741510,reason=107
6/24 12:47:27 (671.0) (29834):  rval = 0, errno = 25
6/24 12:47:27 (671.0) (29834): Shadow: do_REMOTE_syscall returned < 0
6/24 12:47:27 (671.0) (29834): Job 671.0 is being evicted
6/24 12:47:27 (671.0) (29834): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 15:55:16 (646.0) (27325): About to decode condor_sysnum
6/24 15:55:16 (646.0) (27325): Got request for syscall job_exit (-65)
6/24 15:55:16 (646.0) (27325): in pseudo_job_exit: status=-1073741510,reason=107
6/24 15:55:16 (646.0) (27325):  rval = 0, errno = 25
6/24 15:55:16 (646.0) (27325): Shadow: do_REMOTE_syscall returned < 0
6/24 15:55:16 (646.0) (27325): Job 646.0 is being evicted
6/24 15:55:16 (646.0) (27325): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 15:56:15 ******************************************************
6/24 15:56:15 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 15:56:15 ** /usr/local/condor/sbin/condor_shadow
6/24 15:56:15 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 15:56:15 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 15:56:15 ** PID = 30484
6/24 15:56:15 ** Log last touched 6/24 15:55:16
6/24 15:56:15 ******************************************************
6/24 15:56:15 Using config source: /home/condor/condor_config
6/24 15:56:15 Using local config sources:
6/24 15:56:15    /home/condor/condor_config.local
6/24 15:56:15 DaemonCore: Command Socket at <165.134.74.162:44095>
6/24 15:56:15 Initializing a VANILLA shadow for job 671.0
6/24 15:56:15 (671.0) (30484): Request to run on <165.134.71.184:1030> was ACCEPTED
6/24 15:56:15 (671.0) (30484): About to decode condor_sysnum
6/24 15:56:15 (671.0) (30484): Got request for syscall get_job_info (-63)
6/24 15:56:15 (671.0) (30484):  rval = 0, errno = 0
6/24 15:56:15 (671.0) (30484): About to decode condor_sysnum
6/24 15:56:15 (671.0) (30484): Got request for syscall register_starter_info (-77)
6/24 15:56:15 (671.0) (30484):   StarterIpAddr = <165.134.71.184:1394>
6/24 15:56:15 (671.0) (30484):   UidDomain = alishahiha
6/24 15:56:15 (671.0) (30484):   FileSystemDomain = alishahiha
6/24 15:56:15 (671.0) (30484):   Machine = alishahiha
6/24 15:56:15 (671.0) (30484):   Arch = INTEL
6/24 15:56:15 (671.0) (30484):   OpSys = WINNT51
6/24 15:56:15 (671.0) (30484):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 15:56:15 (671.0) (30484):   HasReconnect = TRUE
6/24 15:56:15 (671.0) (30484):  rval = 0, errno = 0
6/24 15:56:18 (671.0) (30484): About to decode condor_sysnum
6/24 15:56:18 (671.0) (30484): Got request for syscall begin_execution (-78)
6/24 15:56:18 (671.0) (30484):  rval = 0, errno = 0
6/24 16:01:15 ******************************************************
6/24 16:01:15 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 16:01:15 ** /usr/local/condor/sbin/condor_shadow
6/24 16:01:15 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 16:01:15 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 16:01:15 ** PID = 30504
6/24 16:01:15 ** Log last touched 6/24 15:56:18
6/24 16:01:15 ******************************************************
6/24 16:01:15 Using config source: /home/condor/condor_config
6/24 16:01:15 Using local config sources:
6/24 16:01:15    /home/condor/condor_config.local
6/24 16:01:15 DaemonCore: Command Socket at <165.134.74.162:44115>
6/24 16:01:15 Initializing a VANILLA shadow for job 646.0
6/24 16:01:15 (646.0) (30504): Request to run on <165.134.71.190:1031> was ACCEPTED
6/24 16:01:17 (646.0) (30504): About to decode condor_sysnum
6/24 16:01:17 (646.0) (30504): Got request for syscall get_job_info (-63)
6/24 16:01:17 (646.0) (30504):  rval = 0, errno = 0
6/24 16:01:17 (646.0) (30504): About to decode condor_sysnum
6/24 16:01:17 (646.0) (30504): Got request for syscall register_starter_info (-77)
6/24 16:01:17 (646.0) (30504):   StarterIpAddr = <165.134.71.190:1751>
6/24 16:01:17 (646.0) (30504):   UidDomain = Saloon2
6/24 16:01:17 (646.0) (30504):   FileSystemDomain = Saloon2
6/24 16:01:17 (646.0) (30504):   Machine = Saloon2
6/24 16:01:17 (646.0) (30504):   Arch = INTEL
6/24 16:01:17 (646.0) (30504):   OpSys = WINNT50
6/24 16:01:17 (646.0) (30504):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 16:01:17 (646.0) (30504):   HasReconnect = TRUE
6/24 16:01:17 (646.0) (30504):  rval = 0, errno = 0
6/24 16:01:20 (646.0) (30504): About to decode condor_sysnum
6/24 16:01:20 (646.0) (30504): Got request for syscall begin_execution (-78)
6/24 16:01:20 (646.0) (30504):  rval = 0, errno = 0
6/24 16:03:48 (671.0) (30484): About to decode condor_sysnum
6/24 16:03:48 (671.0) (30484): Got request for syscall job_exit (-65)
6/24 16:03:48 (671.0) (30484): in pseudo_job_exit: status=0,reason=107
6/24 16:03:48 (671.0) (30484):  rval = 0, errno = 25
6/24 16:03:48 (671.0) (30484): Shadow: do_REMOTE_syscall returned < 0
6/24 16:03:48 (671.0) (30484): Job 671.0 is being evicted
6/24 16:03:48 (671.0) (30484): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 17:24:16 (678.0) (21692): About to decode condor_sysnum
6/24 17:24:16 (678.0) (21692): Got request for syscall job_exit (-65)
6/24 17:24:16 (678.0) (21692): in pseudo_job_exit: status=-1073741510,reason=107
6/24 17:24:16 (678.0) (21692):  rval = 0, errno = 25
6/24 17:24:16 (678.0) (21692): Shadow: do_REMOTE_syscall returned < 0
6/24 17:24:16 (678.0) (21692): Job 678.0 is being evicted
6/24 17:24:16 (678.0) (21692): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 17:24:20 ******************************************************
6/24 17:24:20 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 17:24:20 ** /usr/local/condor/sbin/condor_shadow
6/24 17:24:20 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 17:24:20 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 17:24:20 ** PID = 30890
6/24 17:24:20 ** Log last touched 6/24 17:24:16
6/24 17:24:20 ******************************************************
6/24 17:24:20 Using config source: /home/condor/condor_config
6/24 17:24:20 Using local config sources:
6/24 17:24:20    /home/condor/condor_config.local
6/24 17:24:20 DaemonCore: Command Socket at <165.134.74.162:44363>
6/24 17:24:20 Initializing a VANILLA shadow for job 691.0
6/24 17:24:20 (691.0) (30890): Request to run on <165.134.74.30:1034> was ACCEPTED
6/24 17:24:20 (691.0) (30890): About to decode condor_sysnum
6/24 17:24:20 (691.0) (30890): Got request for syscall get_job_info (-63)
6/24 17:24:20 (691.0) (30890):  rval = 0, errno = 0
6/24 17:24:20 (691.0) (30890): About to decode condor_sysnum
6/24 17:24:20 (691.0) (30890): Got request for syscall register_starter_info (-77)
6/24 17:24:20 (691.0) (30890):   StarterIpAddr = <165.134.74.30:1399>
6/24 17:24:20 (691.0) (30890):   UidDomain = ipm-7399c481875
6/24 17:24:20 (691.0) (30890):   FileSystemDomain = ipm-7399c481875
6/24 17:24:20 (691.0) (30890):   Machine = vm2@ipm-7399c481875
6/24 17:24:20 (691.0) (30890):   Arch = INTEL
6/24 17:24:20 (691.0) (30890):   OpSys = WINNT51
6/24 17:24:20 (691.0) (30890):   CondorVersion = $CondorVersion: 6.8.4 Feb  1 2007 $
6/24 17:24:20 (691.0) (30890):   HasReconnect = TRUE
6/24 17:24:20 (691.0) (30890):  rval = 0, errno = 0
6/24 17:24:21 (691.0) (30890): About to decode condor_sysnum
6/24 17:24:21 (691.0) (30890): Got request for syscall begin_execution (-78)
6/24 17:24:21 (691.0) (30890):  rval = 0, errno = 0
6/24 17:24:21 (691.0) (30890): About to decode condor_sysnum
6/24 17:24:21 (691.0) (30890): Got request for syscall job_exit (-65)
6/24 17:24:21 (691.0) (30890): in pseudo_job_exit: status=0,reason=100
6/24 17:24:21 (691.0) (30890):  rval = 0, errno = 25
6/24 17:24:21 (691.0) (30890): Shadow: do_REMOTE_syscall returned < 0
6/24 17:24:21 (691.0) (30890): Job 691.0 terminated: exited with status 0
6/24 17:24:21 (691.0) (30890): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 100
6/24 17:29:20 ******************************************************
6/24 17:29:20 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 17:29:20 ** /usr/local/condor/sbin/condor_shadow
6/24 17:29:20 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 17:29:20 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 17:29:20 ** PID = 30912
6/24 17:29:20 ** Log last touched 6/24 17:24:21
6/24 17:29:20 ******************************************************
6/24 17:29:20 Using config source: /home/condor/condor_config
6/24 17:29:20 Using local config sources:
6/24 17:29:20    /home/condor/condor_config.local
6/24 17:29:20 DaemonCore: Command Socket at <165.134.74.162:44384>
6/24 17:29:20 Initializing a VANILLA shadow for job 671.0
6/24 17:29:20 (671.0) (30912): Request to run on <165.134.74.30:1034> was ACCEPTED
6/24 17:29:20 (671.0) (30912): About to decode condor_sysnum
6/24 17:29:20 (671.0) (30912): Got request for syscall get_job_info (-63)
6/24 17:29:20 (671.0) (30912):  rval = 0, errno = 0
6/24 17:29:20 (671.0) (30912): About to decode condor_sysnum
6/24 17:29:20 (671.0) (30912): Got request for syscall register_starter_info (-77)
6/24 17:29:20 (671.0) (30912):   StarterIpAddr = <165.134.74.30:1426>
6/24 17:29:20 (671.0) (30912):   UidDomain = ipm-7399c481875
6/24 17:29:20 (671.0) (30912):   FileSystemDomain = ipm-7399c481875
6/24 17:29:20 (671.0) (30912):   Machine = vm2@ipm-7399c481875
6/24 17:29:20 (671.0) (30912):   Arch = INTEL
6/24 17:29:20 (671.0) (30912):   OpSys = WINNT51
6/24 17:29:20 (671.0) (30912):   CondorVersion = $CondorVersion: 6.8.4 Feb  1 2007 $
6/24 17:29:20 (671.0) (30912):   HasReconnect = TRUE
6/24 17:29:20 (671.0) (30912):  rval = 0, errno = 0
6/24 17:29:22 (671.0) (30912): About to decode condor_sysnum
6/24 17:29:22 (671.0) (30912): Got request for syscall begin_execution (-78)
6/24 17:29:22 (671.0) (30912):  rval = 0, errno = 0
6/24 17:49:20 ******************************************************
6/24 17:49:20 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 17:49:20 ** /usr/local/condor/sbin/condor_shadow
6/24 17:49:20 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 17:49:20 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 17:49:20 ** PID = 30966
6/24 17:49:20 ** Log last touched 6/24 17:29:22
6/24 17:49:20 ******************************************************
6/24 17:49:20 Using config source: /home/condor/condor_config
6/24 17:49:20 Using local config sources:
6/24 17:49:20    /home/condor/condor_config.local
6/24 17:49:20 DaemonCore: Command Socket at <165.134.74.162:44445>
6/24 17:49:20 Initializing a VANILLA shadow for job 678.0
6/24 17:49:20 (678.0) (30966): Request to run on <165.134.71.189:1046> was ACCEPTED
6/24 17:49:22 (678.0) (30966): About to decode condor_sysnum
6/24 17:49:22 (678.0) (30966): Got request for syscall get_job_info (-63)
6/24 17:49:22 (678.0) (30966):  rval = 0, errno = 0
6/24 17:49:22 (678.0) (30966): About to decode condor_sysnum
6/24 17:49:22 (678.0) (30966): Got request for syscall register_starter_info (-77)
6/24 17:49:22 (678.0) (30966):   StarterIpAddr = <165.134.71.189:3197>
6/24 17:49:22 (678.0) (30966):   UidDomain = ayazi
6/24 17:49:22 (678.0) (30966):   FileSystemDomain = ayazi
6/24 17:49:22 (678.0) (30966):   Machine = vm1@ayazi
6/24 17:49:22 (678.0) (30966):   Arch = INTEL
6/24 17:49:22 (678.0) (30966):   OpSys = WINNT51
6/24 17:49:22 (678.0) (30966):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 17:49:22 (678.0) (30966):   HasReconnect = TRUE
6/24 17:49:22 (678.0) (30966):  rval = 0, errno = 0
6/24 17:49:24 (678.0) (30966): About to decode condor_sysnum
6/24 17:49:24 (678.0) (30966): Got request for syscall begin_execution (-78)
6/24 17:49:24 (678.0) (30966):  rval = 0, errno = 0
6/24 20:00:39 (678.0) (30966): About to decode condor_sysnum
6/24 20:00:39 (678.0) (30966): condor_read(): recv() returned -1, errno = 110, assuming failure.
6/24 20:00:39 (678.0) (30966): Can no longer talk to condor_starter <165.134.71.189:1046>
6/24 20:00:39 (678.0) (30966): Trying to reconnect to disconnected job
6/24 20:00:39 (678.0) (30966): LastJobLeaseRenewal: 1182698063 Sun Jun 24 19:44:23 2007
6/24 20:00:39 (678.0) (30966): JobLeaseDuration: 1200 seconds
6/24 20:00:39 (678.0) (30966): JobLeaseDuration remaining: 224
6/24 20:00:39 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:01:09 (678.0) (30966): select returns 0, connect failed
6/24 20:01:09 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:01:10 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:01:10 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:01:10 (678.0) (30966): JobLeaseDuration remaining: 1169
6/24 20:01:10 (678.0) (30966): Scheduling another attempt to reconnect in 8 seconds
6/24 20:01:18 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:01:48 (678.0) (30966): select returns 0, connect failed
6/24 20:01:48 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:01:49 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:01:49 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:01:49 (678.0) (30966): JobLeaseDuration remaining: 1130
6/24 20:01:49 (678.0) (30966): Scheduling another attempt to reconnect in 16 seconds
6/24 20:02:05 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:02:35 (678.0) (30966): select returns 0, connect failed
6/24 20:02:35 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:02:36 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:02:36 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:02:36 (678.0) (30966): JobLeaseDuration remaining: 1083
6/24 20:02:36 (678.0) (30966): Scheduling another attempt to reconnect in 32 seconds
6/24 20:03:08 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:03:38 (678.0) (30966): select returns 0, connect failed
6/24 20:03:38 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:03:39 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:03:39 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:03:39 (678.0) (30966): JobLeaseDuration remaining: 1020
6/24 20:03:39 (678.0) (30966): Scheduling another attempt to reconnect in 64 seconds
6/24 20:04:43 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:05:13 (678.0) (30966): select returns 0, connect failed
6/24 20:05:13 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:05:14 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:05:14 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:05:14 (678.0) (30966): JobLeaseDuration remaining: 925
6/24 20:05:14 (678.0) (30966): Scheduling another attempt to reconnect in 128 seconds
6/24 20:07:22 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:07:52 (678.0) (30966): select returns 0, connect failed
6/24 20:07:52 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:07:53 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:07:53 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:07:53 (678.0) (30966): JobLeaseDuration remaining: 766
6/24 20:07:53 (678.0) (30966): Scheduling another attempt to reconnect in 256 seconds
6/24 20:12:09 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:12:39 (678.0) (30966): select returns 0, connect failed
6/24 20:12:39 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:12:40 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:12:40 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:12:40 (678.0) (30966): JobLeaseDuration remaining: 479
6/24 20:12:40 (678.0) (30966): Scheduling another attempt to reconnect in 300 seconds
6/24 20:17:40 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:18:10 (678.0) (30966): select returns 0, connect failed
6/24 20:18:10 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:18:11 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:18:11 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:18:11 (678.0) (30966): JobLeaseDuration remaining: 148
6/24 20:18:11 (678.0) (30966): Scheduling another attempt to reconnect in 148 seconds
6/24 20:20:39 (678.0) (30966): Attempting to reconnect to starter <165.134.71.189:3197>
6/24 20:21:09 (678.0) (30966): select returns 0, connect failed
6/24 20:21:09 (678.0) (30966): Will keep trying for 30 seconds...
6/24 20:21:10 (678.0) (30966): Connect failed for 31 seconds; returning FALSE
6/24 20:21:10 (678.0) (30966): Attempt to reconnect failed: Failed to connect to starter <165.134.71.189:3197>
6/24 20:21:10 (678.0) (30966): JobLeaseDuration remaining: EXPIRED!
6/24 20:21:10 (678.0) (30966): Reconnect FAILED: Job disconnected too long: JobLeaseDuration (1200 seconds) expired
6/24 20:21:10 (678.0) (30966): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
6/24 21:44:24 ******************************************************
6/24 21:44:24 ** condor_shadow (CONDOR_SHADOW) STARTING UP
6/24 21:44:24 ** /usr/local/condor/sbin/condor_shadow
6/24 21:44:24 ** $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 21:44:24 ** $CondorPlatform: I386-LINUX_RHEL3 $
6/24 21:44:24 ** PID = 31565
6/24 21:44:24 ** Log last touched 6/24 20:21:10
6/24 21:44:24 ******************************************************
6/24 21:44:24 Using config source: /home/condor/condor_config
6/24 21:44:24 Using local config sources:
6/24 21:44:24    /home/condor/condor_config.local
6/24 21:44:24 DaemonCore: Command Socket at <165.134.74.162:45094>
6/24 21:44:24 Initializing a VANILLA shadow for job 678.0
6/24 21:44:24 (678.0) (31565): Request to run on <165.134.74.172:1030> was ACCEPTED
6/24 21:44:25 (678.0) (31565): About to decode condor_sysnum
6/24 21:44:25 (678.0) (31565): Got request for syscall get_job_info (-63)
6/24 21:44:25 (678.0) (31565):  rval = 0, errno = 0
6/24 21:44:25 (678.0) (31565): About to decode condor_sysnum
6/24 21:44:25 (678.0) (31565): Got request for syscall register_starter_info (-77)
6/24 21:44:25 (678.0) (31565):   StarterIpAddr = <165.134.74.172:4839>
6/24 21:44:25 (678.0) (31565):   UidDomain = arabgol
6/24 21:44:25 (678.0) (31565):   FileSystemDomain = arabgol
6/24 21:44:25 (678.0) (31565):   Machine = arabgol
6/24 21:44:25 (678.0) (31565):   Arch = INTEL
6/24 21:44:25 (678.0) (31565):   OpSys = WINNT51
6/24 21:44:25 (678.0) (31565):   CondorVersion = $CondorVersion: 6.8.0 Jul 19 2006 $
6/24 21:44:25 (678.0) (31565):   HasReconnect = TRUE
6/24 21:44:25 (678.0) (31565):  rval = 0, errno = 0
6/24 21:44:27 (678.0) (31565): About to decode condor_sysnum
6/24 21:44:27 (678.0) (31565): Got request for syscall begin_execution (-78)
6/24 21:44:27 (678.0) (31565):  rval = 0, errno = 0
6/24 23:35:22 (670.0) (17628): About to decode condor_sysnum
6/24 23:35:22 (670.0) (17628): condor_read(): recv() returned -1, errno = 104, assuming failure.
6/24 23:35:22 (670.0) (17628): Can no longer talk to condor_starter <165.134.74.218:1030>
6/24 23:35:22 (670.0) (17628): Trying to reconnect to disconnected job
6/24 23:35:22 (670.0) (17628): LastJobLeaseRenewal: 1182707132 Sun Jun 24 22:15:32 2007
6/24 23:35:22 (670.0) (17628): JobLeaseDuration: 14600 seconds
6/24 23:35:22 (670.0) (17628): JobLeaseDuration remaining: 9810
6/24 23:35:22 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:35:52 (670.0) (17628): select returns 0, connect failed
6/24 23:35:52 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:35:53 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:35:53 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:35:53 (670.0) (17628): JobLeaseDuration remaining: 14569
6/24 23:35:53 (670.0) (17628): Scheduling another attempt to reconnect in 8 seconds
6/24 23:36:01 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:36:31 (670.0) (17628): select returns 0, connect failed
6/24 23:36:31 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:36:32 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:36:32 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:36:32 (670.0) (17628): JobLeaseDuration remaining: 14530
6/24 23:36:32 (670.0) (17628): Scheduling another attempt to reconnect in 16 seconds
6/24 23:36:48 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:37:18 (670.0) (17628): select returns 0, connect failed
6/24 23:37:18 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:37:19 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:37:19 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:37:19 (670.0) (17628): JobLeaseDuration remaining: 14483
6/24 23:37:19 (670.0) (17628): Scheduling another attempt to reconnect in 32 seconds
6/24 23:37:51 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:38:21 (670.0) (17628): select returns 0, connect failed
6/24 23:38:21 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:38:22 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:38:22 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:38:22 (670.0) (17628): JobLeaseDuration remaining: 14420
6/24 23:38:22 (670.0) (17628): Scheduling another attempt to reconnect in 64 seconds
6/24 23:39:26 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:39:56 (670.0) (17628): select returns 0, connect failed
6/24 23:39:56 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:39:57 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:39:57 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:39:57 (670.0) (17628): JobLeaseDuration remaining: 14325
6/24 23:39:57 (670.0) (17628): Scheduling another attempt to reconnect in 128 seconds
6/24 23:42:05 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:42:35 (670.0) (17628): select returns 0, connect failed
6/24 23:42:35 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:42:36 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:42:36 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:42:36 (670.0) (17628): JobLeaseDuration remaining: 14166
6/24 23:42:36 (670.0) (17628): Scheduling another attempt to reconnect in 256 seconds
6/24 23:46:52 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:47:22 (670.0) (17628): select returns 0, connect failed
6/24 23:47:22 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:47:23 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:47:23 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:47:23 (670.0) (17628): JobLeaseDuration remaining: 13879
6/24 23:47:23 (670.0) (17628): Scheduling another attempt to reconnect in 300 seconds
6/24 23:52:23 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:52:53 (670.0) (17628): select returns 0, connect failed
6/24 23:52:53 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:52:54 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:52:54 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:52:54 (670.0) (17628): JobLeaseDuration remaining: 13548
6/24 23:52:54 (670.0) (17628): Scheduling another attempt to reconnect in 300 seconds
6/24 23:57:54 (670.0) (17628): Attempting to reconnect to starter <165.134.74.218:2520>
6/24 23:58:24 (670.0) (17628): select returns 0, connect failed
6/24 23:58:24 (670.0) (17628): Will keep trying for 30 seconds...
6/24 23:58:25 (670.0) (17628): Connect failed for 31 seconds; returning FALSE
6/24 23:58:25 (670.0) (17628): Attempt to reconnect failed: Failed to connect to starter <165.134.74.218:2520>
6/24 23:58:25 (670.0) (17628): JobLeaseDuration remaining: 13217
6/24 23:58:25 (670.0) (17628): Scheduling another attempt to reconnect in 300 seconds