
Re: [Condor-users] Strange schedd crash (exit status 44)



My euro cent:

My jobs were canceled just because I forgot to add the #! line as the first line of my bash script:

#!/bin/bash

just as we use #!/usr/bin/perl for Perl scripts.
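
A quick way to catch this before submitting is to check the first line of the script. The sketch below is only an illustration and not part of Condor: the helper name has_shebang is made up here, and it assumes a Unix-like execute node, where the kernel uses the #! line to pick the interpreter.

```python
def has_shebang(script_text: str) -> bool:
    """Return True if the first line of the script starts with '#!'."""
    # Only the very first line counts; a '#!' further down is just a comment.
    first_line = script_text.split("\n", 1)[0]
    return first_line.startswith("#!")


# Hypothetical script contents, used only to demonstrate the check.
with_line = "#!/bin/bash\necho hello\n"
without_line = "echo hello\n"

print(has_shebang(with_line))
print(has_shebang(without_line))
```

To use it on a real job script, read the file's text and pass it to has_shebang before submitting.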


Cheers,


	Alain
-----------------------

Hoping it is as simple as that.

Ian Chesal wrote:
Okay. Something is definitely wrong here. Shadows are dying and they're
taking out the schedd with them. That's not good. Can anyone offer any
insight?

Thanks!
Ian


----

This is an automated email from the Condor system on machine
"TTC-MDEHKORD.altera.priv.altera.com".  Do not reply.

"d:\abc\condor/bin/condor_schedd.exe" on
"TTC-MDEHKORD.altera.priv.altera.com" exited with status 44.

Condor will automatically restart this process in 10 seconds.

*** Last 100 line(s) of file SchedLog:
11/24 14:31:04 ERROR: Shadow exited with job exception code!
11/24 14:31:06 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 2260)
11/24 14:31:06 DaemonCore: Command received via UDP from host
<137.57.142.168:4226>
11/24 14:31:06 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:31:06 ERROR: Shadow exited with job exception code!
11/24 14:31:08 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 2484)
11/24 14:31:08 DaemonCore: Command received via UDP from host
<137.57.142.168:4230>
11/24 14:31:08 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:31:08 ERROR: Shadow exited with job exception code!
11/24 14:31:08 Match for cluster 15 has had 5 shadow exceptions,
relinquishing.
11/24 14:31:08 Sent RELEASE_CLAIM to startd on <137.57.176.30:4411>
11/24 14:31:08 Match record (<137.57.176.30:4411>, 15, 0) deleted
11/24 14:31:10 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 176)
11/24 14:31:10 Sent ad to 1 collectors for mdehkord@xxxxxxxxxx
11/24 14:31:10 DaemonCore: Command received via UDP from host
<137.57.142.168:4238>
11/24 14:31:10 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:31:10 ERROR: Shadow exited with job exception code!
11/24 14:31:10 Match for cluster 15 has had 5 shadow exceptions,
relinquishing.
11/24 14:31:10 Sent RELEASE_CLAIM to startd on <137.57.176.30:4411>
11/24 14:31:10 Match record (<137.57.176.30:4411>, 15, 1) deleted
11/24 14:31:11 DaemonCore: Command received via TCP from host
<137.57.176.30:2376>
11/24 14:31:11 DaemonCore: received command 443 (VACATE_SERVICE),
calling handler (vacate_service)
11/24 14:31:12 Got VACATE_SERVICE from <137.57.176.30:2376>
11/24 14:31:12 DaemonCore: Command received via TCP from host
<137.57.176.30:2377>
11/24 14:31:12 DaemonCore: received command 443 (VACATE_SERVICE),
calling handler (vacate_service)
11/24 14:31:12 Got VACATE_SERVICE from <137.57.176.30:2377>
11/24 14:32:42 Activity on stashed negotiator socket
11/24 14:32:42 Negotiating for owner: mdehkord@xxxxxxxxxx
11/24 14:32:42 Checking consistency running and runnable jobs
11/24 14:32:42 Tables are consistent
11/24 14:32:42 Out of jobs - 2 jobs matched, 0 jobs idle, flock level =
0
11/24 14:32:42 Sent ad to 1 collectors for mdehkord@xxxxxxxxxx
11/24 14:32:46 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 840)
11/24 14:32:47 DaemonCore: Command received via UDP from host
<137.57.142.168:4250>
11/24 14:32:47 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:47 ERROR: Shadow exited with job exception code!
11/24 14:32:48 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 3388)
11/24 14:32:49 DaemonCore: Command received via UDP from host
<137.57.142.168:4254>
11/24 14:32:49 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:49 ERROR: Shadow exited with job exception code!
11/24 14:32:50 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 2024)
11/24 14:32:51 DaemonCore: Command received via UDP from host
<137.57.142.168:4258>
11/24 14:32:51 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:51 ERROR: Shadow exited with job exception code!
11/24 14:32:52 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 3784)
11/24 14:32:53 DaemonCore: Command received via UDP from host
<137.57.142.168:4262>
11/24 14:32:53 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:53 ERROR: Shadow exited with job exception code!
11/24 14:32:54 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 3364)
11/24 14:32:55 DaemonCore: Command received via UDP from host
<137.57.142.168:4266>
11/24 14:32:55 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:55 ERROR: Shadow exited with job exception code!
11/24 14:32:56 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 3028)
11/24 14:32:57 DaemonCore: Command received via UDP from host
<137.57.142.168:4270>
11/24 14:32:57 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:57 ERROR: Shadow exited with job exception code!
11/24 14:32:58 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 3224)
11/24 14:32:59 DaemonCore: Command received via UDP from host
<137.57.142.168:4274>
11/24 14:32:59 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:32:59 ERROR: Shadow exited with job exception code!
11/24 14:33:00 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 908)
11/24 14:33:01 DaemonCore: Command received via UDP from host
<137.57.142.168:4278>
11/24 14:33:01 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:33:01 ERROR: Shadow exited with job exception code!
11/24 14:33:02 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 756)
11/24 14:33:03 DaemonCore: Command received via UDP from host
<137.57.142.168:4282>
11/24 14:33:03 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:33:03 ERROR: Shadow exited with job exception code!
11/24 14:33:03 Match for cluster 15 has had 5 shadow exceptions,
relinquishing.
11/24 14:33:03 Sent RELEASE_CLAIM to startd on <137.57.176.30:4411>
11/24 14:33:03 Match record (<137.57.176.30:4411>, 15, 0) deleted
11/24 14:33:03 DaemonCore: Command received via TCP from host
<137.57.176.30:2392>
11/24 14:33:03 DaemonCore: received command 443 (VACATE_SERVICE),
calling handler (vacate_service)
11/24 14:33:03 Got VACATE_SERVICE from <137.57.176.30:2392>
11/24 14:33:04 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 3928)
11/24 14:33:04 Sent ad to 1 collectors for mdehkord@xxxxxxxxxx
11/24 14:33:05 DaemonCore: Command received via UDP from host
<137.57.142.168:4290>
11/24 14:33:05 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:33:05 ERROR: Shadow exited with job exception code!
11/24 14:33:05 Match for cluster 15 has had 5 shadow exceptions,
relinquishing.
11/24 14:33:05 Sent RELEASE_CLAIM to startd on <137.57.176.30:4411>
11/24 14:33:05 Match record (<137.57.176.30:4411>, 15, 1) deleted
11/24 14:33:05 DaemonCore: Command received via TCP from host
<137.57.176.30:2393>
11/24 14:33:05 DaemonCore: received command 443 (VACATE_SERVICE),
calling handler (vacate_service)
11/24 14:33:05 Got VACATE_SERVICE from <137.57.176.30:2393>
11/24 14:34:42 Activity on stashed negotiator socket
11/24 14:34:42 Negotiating for owner: mdehkord@xxxxxxxxxx
11/24 14:34:43 Checking consistency running and runnable jobs
11/24 14:34:43 Tables are consistent
11/24 14:34:43 Out of jobs - 2 jobs matched, 0 jobs idle, flock level =
0
11/24 14:34:43 Sent ad to 1 collectors for mdehkord@xxxxxxxxxx
11/24 14:34:48 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 2188)
11/24 14:34:48 DaemonCore: Command received via UDP from host
<137.57.142.168:4305>
11/24 14:34:48 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:34:48 ERROR: Shadow exited with job exception code!
11/24 14:34:50 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 2796)
11/24 14:34:50 DaemonCore: Command received via UDP from host
<137.57.142.168:4309>
11/24 14:34:50 DaemonCore: received command 60001 (DC_PROCESSEXIT),
calling handler (HandleProcessExitCommand())

11/24 14:34:50 ERROR: Shadow exited with job exception code!
11/24 14:34:52 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 712)
*** End of file SchedLog
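
To see the pattern in an excerpt like the one above at a glance, one option is to tally shadow starts per job and count the exception lines. This is a rough sketch, not a Condor tool: the function name summarize is made up, and the two message strings it matches are copied from this log, so other Condor versions may word them differently.

```python
import re
from collections import Counter

# Patterns taken from the SchedLog lines quoted in this thread.
STARTED = re.compile(r"Started shadow for job (\d+\.\d+)")
EXCEPTION = "Shadow exited with job exception code!"


def summarize(log_text):
    """Count shadow starts per job id and the total number of shadow exceptions."""
    starts = Counter()
    exceptions = 0
    for line in log_text.splitlines():
        match = STARTED.search(line)
        if match:
            starts[match.group(1)] += 1
        elif EXCEPTION in line:
            exceptions += 1
    return starts, exceptions


# A few lines copied from the log above, including the mail client's wrapping.
sample = """\
11/24 14:31:06 Started shadow for job 15.1 on "<137.57.176.30:4411>",
(shadow pid = 2260)
11/24 14:31:06 ERROR: Shadow exited with job exception code!
11/24 14:31:08 Started shadow for job 15.0 on "<137.57.176.30:4411>",
(shadow pid = 2484)
11/24 14:31:08 ERROR: Shadow exited with job exception code!
"""

starts, exceptions = summarize(sample)
print(dict(starts), exceptions)
```

If every shadow start is followed by an exception, as it is here, the tally points at the jobs themselves (or their environment on the execute node) rather than at one bad match.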


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
Sent: November 24, 2004 12:10 PM
To: Condor-Users Mail List
Subject: RE: [Condor-users] Strange schedd crash (exit status 44)


We got the same crash again with the schedd on Windows. This is the 6.7.2 branch. Is there something in the output that might tip us off to a problem? It looks like it's dying while trying to fork a condor_shadow for a new job in both cases.

Thanks!
Ian

----
This is an automated email from the Condor system on machine "TTC-GQUAN3.altera.priv.altera.com". Do not reply.


"d:\abc\condor/bin/condor_schedd.exe" on "TTC-GQUAN3.altera.priv.altera.com" exited with status 44.
Condor will automatically restart this process in 10 seconds.


*** Last 100 line(s) of file SchedLog:
11/24 09:14:42 attempt to add pre-existing match "<137.57.176.183:4197>#1099203124#1706" ignored
11/24 09:14:42 attempt to add pre-existing match "<137.57.176.179:2712>#1099202607#1606" ignored
11/24 09:14:42 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/24 09:14:42 Match record (<137.57.176.180:1047>, 20, 234) deleted
11/24 09:14:49 DaemonCore: Command received via UDP from host <137.57.142.51:1319>
11/24 09:14:49 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:14:52 Started shadow for job 20.232 on "<137.57.176.180:1047>", (shadow pid = 2152)
11/24 09:14:52 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:14:53 DaemonCore: Command received via TCP from host <137.57.176.180:1877>
11/24 09:14:53 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/24 09:14:53 Got VACATE_SERVICE from <137.57.176.180:1877>
11/24 09:14:53 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/24 09:14:53 Match record (<137.57.176.180:1047>, 20, 232) deleted
11/24 09:14:53 DaemonCore: Command received via UDP from host <137.57.142.51:1331>
11/24 09:14:53 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:14:54 DaemonCore: Command received via UDP from host <137.57.142.51:1332>
11/24 09:14:54 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:14:54 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/24 09:14:54 Null parameter --- match not deleted
11/24 09:14:56 Started shadow for job 20.233 on "<137.57.176.180:1047>", (shadow pid = 2720)
11/24 09:14:58 Started shadow for job 20.234 on "<137.57.176.180:1047>", (shadow pid = 2100)
11/24 09:14:58 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:16:41 Response problem from startd.
11/24 09:16:41 Sent RELEASE_CLAIM to startd on <137.57.176.183:4197>
11/24 09:16:41 Match record (<137.57.176.183:4197>, 20, 235) deleted
11/24 09:16:42 Activity on stashed negotiator socket
11/24 09:16:42 Negotiating for owner: gquan@xxxxxxxxxx
11/24 09:16:42 Checking consistency running and runnable jobs
11/24 09:16:42 Tables are consistent
11/24 09:16:43 Out of servers - 0 jobs matched, 36 jobs idle, 1 jobs rejected
11/24 09:16:43 Response problem from startd.
11/24 09:16:43 Sent RELEASE_CLAIM to startd on <137.57.176.179:2712>
11/24 09:16:43 Match record (<137.57.176.179:2712>, 20, 236) deleted
11/24 09:17:28 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:18:43 Activity on stashed negotiator socket
11/24 09:18:43 Negotiating for owner: gquan@xxxxxxxxxx
11/24 09:18:43 Checking consistency running and runnable jobs
11/24 09:18:43 Tables are consistent
11/24 09:18:43 Out of servers - 0 jobs matched, 36 jobs idle, 1 jobs rejected
11/24 09:19:43 DaemonCore: Command received via UDP from host <137.57.142.51:1395>
11/24 09:19:43 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:19:43 DaemonCore: Command received via UDP from host <137.57.142.51:1398>
11/24 09:19:43 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:19:45 Started shadow for job 20.232 on "<137.57.176.180:1047>", (shadow pid = 2672)
11/24 09:19:47 Started shadow for job 20.235 on "<137.57.176.180:1047>", (shadow pid = 1448)
11/24 09:19:47 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:20:43 Activity on stashed negotiator socket
11/24 09:20:43 Negotiating for owner: gquan@xxxxxxxxxx
11/24 09:20:43 Checking consistency running and runnable jobs
11/24 09:20:43 Tables are consistent
11/24 09:20:44 Out of servers - 4 jobs matched, 30 jobs idle, 1 jobs rejected
11/24 09:20:58 DaemonCore: Command received via UDP from host <137.57.142.51:1427>
11/24 09:20:58 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:21:01 Started shadow for job 20.236 on "<137.57.176.183:4197>", (shadow pid = 2108)
11/24 09:21:01 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:21:07 DaemonCore: Command received via TCP from host <137.57.176.183:2328>
11/24 09:21:07 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/24 09:21:07 Got VACATE_SERVICE from <137.57.176.183:2328>
11/24 09:21:07 Sent RELEASE_CLAIM to startd on <137.57.176.183:4197>
11/24 09:21:07 Match record (<137.57.176.183:4197>, 20, 236) deleted
11/24 09:21:07 DaemonCore: Command received via UDP from host <137.57.142.51:1440>
11/24 09:21:07 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:21:07 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/24 09:21:07 Null parameter --- match not deleted
11/24 09:21:10 Started shadow for job 20.238 on "<137.57.176.183:4197>", (shadow pid = 2772)
11/24 09:21:10 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:22:34 DaemonCore: Command received via UDP from host <137.57.142.51:1462>
11/24 09:22:34 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:22:37 Started shadow for job 20.236 on "<137.57.176.179:2712>", (shadow pid = 2292)
11/24 09:22:37 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:22:42 DaemonCore: Command received via TCP from host <137.57.176.179:2089>
11/24 09:22:42 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/24 09:22:42 Got VACATE_SERVICE from <137.57.176.179:2089>
11/24 09:22:42 Sent RELEASE_CLAIM to startd on <137.57.176.179:2712>
11/24 09:22:42 Match record (<137.57.176.179:2712>, 20, 236) deleted
11/24 09:22:43 DaemonCore: Command received via UDP from host <137.57.142.51:1473>
11/24 09:22:43 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:22:43 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/24 09:22:43 Null parameter --- match not deleted
11/24 09:22:44 Activity on stashed negotiator socket
11/24 09:22:44 Negotiating for owner: gquan@xxxxxxxxxx
11/24 09:22:44 Checking consistency running and runnable jobs
11/24 09:22:45 Tables are consistent
11/24 09:22:45 Out of servers - 3 jobs matched, 29 jobs idle, 1 jobs rejected
11/24 09:22:45 attempt to add pre-existing match "<137.57.176.180:1047>#1100637096#502" ignored
11/24 09:22:45 attempt to add pre-existing match "<137.57.176.180:1047>#1100637096#501" ignored
11/24 09:22:45 attempt to add pre-existing match "<137.57.176.179:2712>#1099202607#1607" ignored
11/24 09:22:45 Started shadow for job 20.239 on "<137.57.176.179:2712>", (shadow pid = 1144)
11/24 09:22:45 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/24 09:24:36 DaemonCore: Command received via UDP from host <137.57.142.51:1505>
11/24 09:24:36 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:24:37 DaemonCore: Command received via UDP from host <137.57.142.51:1508>
11/24 09:24:37 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/24 09:24:39 DaemonCore: Command received via TCP from host <137.57.176.180:2306>
11/24 09:24:39 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/24 09:24:39 Got VACATE_SERVICE from <137.57.176.180:2306>
11/24 09:24:39 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/24 09:24:39 Match record (<137.57.176.180:1047>, 20, 236) deleted
11/24 09:24:39 match or classad for job 20.236 was deleted - not forking a shadow
11/24 09:24:39 Started shadow for job 20.237 on "<137.57.176.180:1047>", (shadow pid = 3912)
*** End of file SchedLog
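
The observation that each crash follows a shadow start can be checked mechanically by pulling out the last timestamped entry before the end-of-log marker in each notification mail. The sketch below is only an illustration: the function name last_entry_before_end is made up, and it assumes the "MM/DD HH:MM:SS" timestamp prefix and the end marker exactly as they appear in these mails.

```python
import re

# Timestamp prefix and end marker as they appear in the mails in this thread.
TIMESTAMP = re.compile(r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")
END_MARKER = "*** End of file SchedLog"


def last_entry_before_end(mail_text):
    """Return the last timestamped log line before the end marker, or None."""
    last = None
    for line in mail_text.splitlines():
        if line.startswith(END_MARKER):
            return last
        # Skip blank lines and wrapped continuation lines without a timestamp.
        if TIMESTAMP.match(line):
            last = line
    return None  # no end marker found


# Tail of the log above, copied from this mail.
sample = """\
11/24 09:24:39 Match record (<137.57.176.180:1047>, 20, 236) deleted
11/24 09:24:39 match or classad for job 20.236 was deleted - not forking a shadow
11/24 09:24:39 Started shadow for job 20.237 on "<137.57.176.180:1047>", (shadow pid = 3912)
*** End of file SchedLog
"""

print(last_entry_before_end(sample))
```

Because lines without a timestamp are ignored, the same check also works on logs where the mail client wrapped the "(shadow pid = ...)" part onto its own line.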




-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Questions about this message or Condor in general?
Email address of the local Condor administrator: swttcabca@xxxxxxxxxx
The Official Condor Homepage is http://www.cs.wisc.edu/condor






-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
Sent: November 23, 2004 2:45 PM
To: Condor-Users Mail List
Subject: [Condor-users] Strange schedd crash (exit status 44)


I get a schedd crash from this user's machine every time he queues up
100 or more jobs. What does exit status 44 indicate?

Thanks!
Ian

-----Original Message-----
From: SYSTEM@xxxxxxxxxx [mailto:SYSTEM@xxxxxxxxxx]
Sent: November 23, 2004 2:32 PM
To: SW TOR Batch System Admins
Subject: [Condor] Problem

This is an automated email from the Condor system on machine "TTC-GQUAN3.altera.priv.altera.com". Do not reply.

"d:\abc\condor/bin/condor_schedd.exe" on "TTC-GQUAN3.altera.priv.altera.com" exited with status 44.
Condor will automatically restart this process in 10 seconds.


*** Last 100 line(s) of file SchedLog:
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.180:1047>#1100637096#282" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.182:1151>#1099422886#1224" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.182:1151>#1099422886#1223" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.183:4197>#1099203124#1580" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.183:4197>#1099203124#1579" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.185:1407>#1099202749#1981" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.185:1407>#1099202749#1982" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.177:1213>#1100703290#277" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.186:2147>#1099203682#1256" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.177:1213>#1100703290#276" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.186:2147>#1099203682#1257" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.178:3591>#1099202664#1406" ignored
11/23 14:28:58 attempt to add pre-existing match "<137.57.176.179:2712>#1099202607#1468" ignored
11/23 14:28:59 attempt to add pre-existing match "<137.57.176.179:2712>#1099202607#1467" ignored
11/23 14:29:31 DaemonCore: Command received via UDP from host <137.57.142.51:4119>
11/23 14:29:31 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:29:36 Started shadow for job 19.130 on "<137.57.176.179:2712>", (shadow pid = 472)
11/23 14:29:36 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/23 14:29:36 timed out requesting claim from <137.57.176.180:1047>
11/23 14:29:36 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:29:36 timed out requesting claim from <137.57.176.180:1047>
11/23 14:29:36 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:29:40 DaemonCore: Command received via TCP from host <137.57.176.179:4906>
11/23 14:29:40 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/23 14:29:40 Got VACATE_SERVICE from <137.57.176.179:4906>
11/23 14:29:40 Sent RELEASE_CLAIM to startd on <137.57.176.179:2712>
11/23 14:29:40 Match record (<137.57.176.179:2712>, 19, 130) deleted
11/23 14:29:40 DaemonCore: Command received via UDP from host <137.57.142.51:4133>
11/23 14:29:40 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:29:40 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/23 14:29:40 Null parameter --- match not deleted
11/23 14:29:44 Started shadow for job 19.159 on "<137.57.176.179:2712>", (shadow pid = 2972)
11/23 14:29:44 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/23 14:29:45 timed out requesting claim from <137.57.176.180:1047>
11/23 14:29:45 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:29:45 timed out requesting claim from <137.57.176.180:1047>
11/23 14:29:45 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:02 DaemonCore: Command received via UDP from host <137.57.142.51:4146>
11/23 14:30:02 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:30:05 condor_read(): recv() returned -1, errno = 10054, assuming failure.
11/23 14:30:05 Response problem from startd.
11/23 14:30:05 Sent RELEASE_CLAIM to startd on <137.57.176.182:1151>
11/23 14:30:05 Match record (<137.57.176.182:1151>, 19, 129) deleted
11/23 14:30:07 Started shadow for job 19.130 on "<137.57.176.182:1151>", (shadow pid = 1036)
11/23 14:30:07 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/23 14:30:07 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:08 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:08 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:08 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:13 DaemonCore: Command received via TCP from host <137.57.176.182:4778>
11/23 14:30:13 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/23 14:30:13 Got VACATE_SERVICE from <137.57.176.182:4778>
11/23 14:30:13 Sent RELEASE_CLAIM to startd on <137.57.176.182:1151>
11/23 14:30:13 Match record (<137.57.176.182:1151>, 19, 130) deleted
11/23 14:30:13 DaemonCore: Command received via UDP from host <137.57.142.51:4176>
11/23 14:30:13 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:30:13 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/23 14:30:13 Null parameter --- match not deleted
11/23 14:30:17 Started shadow for job 19.133 on "<137.57.176.182:1151>", (shadow pid = 2300)
11/23 14:30:17 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/23 14:30:17 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:17 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:17 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:17 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:42 DaemonCore: Command received via UDP from host <137.57.142.51:4190>
11/23 14:30:42 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:30:45 Started shadow for job 19.130 on "<137.57.176.180:1047>", (shadow pid = 3624)
11/23 14:30:45 Sent ad to 1 collectors for gquan@xxxxxxxxxx
11/23 14:30:45 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:46 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:46 timed out requesting claim from <137.57.176.180:1047>
11/23 14:30:46 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:52 DaemonCore: Command received via TCP from host <137.57.176.180:3514>
11/23 14:30:52 DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
11/23 14:30:52 Got VACATE_SERVICE from <137.57.176.180:3514>
11/23 14:30:52 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:52 Match record (<137.57.176.180:1047>, 19, 130) deleted
11/23 14:30:52 DaemonCore: Command received via UDP from host <137.57.142.51:4204>
11/23 14:30:52 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())


11/23 14:30:52 Scheduler::Relinquish - mrec is NULL, can't relinquish
11/23 14:30:52 Null parameter --- match not deleted
11/23 14:30:55 Response problem from startd.
11/23 14:30:55 Sent RELEASE_CLAIM to startd on <137.57.176.180:1047>
11/23 14:30:55 Match record (<137.57.176.180:1047>, 19, 131) deleted
11/23 14:30:56 Response problem from startd.
11/23 14:30:56 Sent RELEASE_CLAIM to startd on <137.57.176.185:1407>
11/23 14:30:56 Match record (<137.57.176.185:1407>, 19, 151) deleted
11/23 14:30:56 Response problem from startd.
11/23 14:30:56 Sent RELEASE_CLAIM to startd on <137.57.176.183:4197>
11/23 14:30:56 Match record (<137.57.176.183:4197>, 19, 147) deleted
11/23 14:30:56 Response problem from startd.
11/23 14:30:56 Sent RELEASE_CLAIM to startd on <137.57.176.183:4197>
11/23 14:30:56 Match record (<137.57.176.183:4197>, 19, 149) deleted
11/23 14:30:56 Response problem from startd.
11/23 14:30:56 Sent RELEASE_CLAIM to startd on <137.57.176.185:1407>
11/23 14:30:56 Match record (<137.57.176.185:1407>, 19, 150) deleted
11/23 14:30:57 Response problem from startd.
11/23 14:30:57 Sent RELEASE_CLAIM to startd on <137.57.176.186:2147>
11/23 14:30:57 Match record (<137.57.176.186:2147>, 19, 155) deleted
11/23 14:30:57 Started shadow for job 19.130 on "<137.57.176.180:1047>", (shadow pid = 2692)
*** End of file SchedLog








_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
http://lists.cs.wisc.edu/mailman/listinfo/condor-users





--
Dr Alain EMPAIN <alain.empain@xxxxxxxxx> <alain@xxxxxxxxxx>
Bioinformatics, Molecular Genetics, Fac. Med. Vet., University of Liège, Belgium
Bd de Colonster, B43 B-4000 Liège (Sart-Tilman)
WORK: +32 4 366 3821 FAX: +32 4 366 4122
HOME: rue des Martyrs,7 B- 4550 Nandrin +32 85 51 23 41
GSM: +32 497 70 17 64

[ Creative Commons ]
Do not confuse 'piracy' with 'knowledge sharing': circulating knowledge is at the very heart of creative and inventive activity. Scientific knowledge is built on centuries of creative sharing.
'Du bon usage de la piraterie' F. Latrive (PDF) http://www.freescape.eu.org/piraterie/complet.html