[Condor-users] startd exiting because of fatal exception



Since upgrading to Condor 7.2.2, I've been bombarded by messages
containing:

Can't find WANT_SUSPEND in internal ClassAd
...
startd exiting because of fatal exception.

(I don't know if the two messages are related.)

A web search for both messages didn't shed any light.

Can anyone suggest how I can fix this?

And will the users' jobs still run?

The hardware is x86_64 with USB keyboards; the Condor master is running
Red Hat 5.1, and the other machines are running SLED 10.

tia, Dick

5/11 09:48:01 slot1: Received match <139.166.250.77:48837>#1242031390#1#...
5/11 09:48:01 slot1: State change: match notification protocol successful
5/11 09:48:01 slot1: Changing state: Unclaimed -> Matched
5/11 09:48:01 slot1: Request accepted.
5/11 09:48:01 slot1: Remote owner is nice-user.gs7@xxxxxxxxxxxxxxx
5/11 09:48:01 slot1: State change: claiming protocol successful
5/11 09:48:01 slot1: Changing state: Matched -> Claimed
5/11 09:48:02 slot4: Request accepted.
5/11 09:48:02 slot4: Remote owner is nice-user.gs7@xxxxxxxxxxxxxxx
5/11 09:48:02 slot4: State change: claiming protocol successful
5/11 09:48:02 slot4: Changing state: Matched -> Claimed
5/11 09:48:03 slot2: Request accepted.
5/11 09:48:03 slot2: Remote owner is nice-user.gs7@xxxxxxxxxxxxxxx
5/11 09:48:03 slot2: State change: claiming protocol successful
5/11 09:48:03 slot2: Changing state: Matched -> Claimed
5/11 09:48:04 slot3: Got activate_claim request from shadow (<139.166.250.57:60394>)
5/11 09:48:05 slot3: Remote job ID is 32.27
5/11 09:48:05 slot3: Got universe "STANDARD" (1) from request classad
5/11 09:48:05 slot3: State change: claim-activation protocol successful
5/11 09:48:05 slot3: Changing activity: Idle -> Busy
5/11 09:48:10 ERROR "Can't find WANT_SUSPEND in internal ClassAd" at line 1226 in file Resource.cpp
5/11 09:48:10 slot1: Changing state and activity: Claimed/Idle -> Preempting/Killing
5/11 09:48:11 slot1: State change: No preempting claim, returning to owner
5/11 09:48:11 slot1: Changing state and activity: Preempting/Killing -> Owner/Idle
5/11 09:48:11 slot1: State change: IS_OWNER is false
5/11 09:48:11 slot1: Changing state: Owner -> Unclaimed
5/11 09:48:11 slot1: State change: IS_OWNER is TRUE
5/11 09:48:11 slot1: Changing state: Unclaimed -> Owner
5/11 09:48:11 slot2: Changing state and activity: Claimed/Idle -> Preempting/Killing
5/11 09:48:11 slot2: State change: No preempting claim, returning to owner
5/11 09:48:11 slot2: Changing state and activity: Preempting/Killing -> Owner/Idle
5/11 09:48:11 slot2: State change: IS_OWNER is false
5/11 09:48:11 slot2: Changing state: Owner -> Unclaimed
5/11 09:48:11 slot3: Changing state and activity: Claimed/Busy -> Preempting/Killing
5/11 09:48:11 slot4: Changing state and activity: Claimed/Idle -> Preempting/Killing
5/11 09:48:11 slot4: State change: No preempting claim, returning to owner
5/11 09:48:11 slot4: Changing state and activity: Preempting/Killing -> Owner/Idle
5/11 09:48:11 slot4: State change: IS_OWNER is false
5/11 09:48:11 slot4: Changing state: Owner -> Unclaimed
5/11 09:48:11 startd exiting because of fatal exception.
5/11 09:48:21 ***************

--
Richard Gillman
ITC UNIX Systems Group, Maclean Building, Wallingford OX10 8BB
Tel: 01491 - 692 339
Fax: 01491 - 692 446