Hello to all,
I'm not trying to be annoying, but I really have no idea how to approach this issue. Any ideas are welcome.
I launched some jobs through my Condor pool. I have a mixed farm of Windows 2003 and Windows XP boxes; the XP boxes are virtual machines running on Linux hosts. The jobs I started last night are still running, but I am receiving several e-mail notifications from all of the Windows XP machines. I launched the jobs from a computer that belonged to another pool, using “condor_submit -pool negotiation -name scheduler condor_submission_filename.sub”. The error message is the following:
This is an automated email from the Condor system on machine "vm4-condor-xp.earthdata.com". Do not reply.
"C:\Condor/bin/condor_startd.exe" on "vm4-condor-xp.earthdata.com" exited with status 4.
Condor will automatically restart this process in 10 seconds.
*** Last 20 line(s) of file C:\Condor/log/StartLog:
8/18 18:07:08 slot1: State change: No preempting claim, returning to owner
8/18 18:07:08 slot1: Changing state and activity: Preempting/Vacating -> Owner/Idle
8/18 18:07:08 slot1: State change: IS_OWNER is false
8/18 18:07:08 slot1: Changing state: Owner -> Unclaimed
8/18 18:11:59 slot1: match_info called
8/18 18:11:59 slot1: Received match <10.2.168.99:1578>#1250626520#7#...
8/18 18:11:59 slot1: State change: match notification protocol successful
8/18 18:11:59 slot1: Changing state: Unclaimed -> Matched
8/18 18:11:59 slot1: Request accepted.
8/18 18:11:59 slot1: Remote owner is aalas@xxxxxxxxxxxxx
8/18 18:11:59 slot1: State change: claiming protocol successful
8/18 18:11:59 slot1: Changing state: Matched -> Claimed
8/18 18:11:59 ERROR "Can't find WANT_SUSPEND in internal ClassAd" at line 1226 in file ..\src\condor_startd.V6\Resource.cpp
8/18 18:11:59 slot1: Changing state and activity: Claimed/Idle -> Preempting/Killing
8/18 18:11:59 slot1: State change: No preempting claim, returning to owner
8/18 18:11:59 slot1: Changing state and activity: Preempting/Killing -> Owner/Idle
8/18 18:11:59 slot1: State change: IS_OWNER is false
8/18 18:11:59 slot1: Changing state: Owner -> Unclaimed
8/18 18:11:59 slot2: Changing state and activity: Claimed/Busy -> Preempting/Killing
8/18 18:11:59 startd exiting because of fatal exception.
*** End of file StartLog
I am not a Condor expert, so I don’t know how to interpret this error message. Any ideas?
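In case it helps anyone triaging the same notification, here is a minimal sketch of how the ERROR line and the slot state changes could be pulled out of a StartLog excerpt. It is my own helper, not part of Condor: the names `parse_startlog`, `LINE_RE` and `ERROR_RE` are hypothetical, and the line format ("M/D HH:MM:SS", then an optional "slotN:" prefix) is only assumed from the excerpt above.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "8/18 18:11:59 slot1: message"  or  "8/18 18:11:59 message"
LINE_RE = re.compile(r'^(\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}) (?:(slot\d+): )?(.*)$')
# Assumed shape of the fatal error line seen in the excerpt.
ERROR_RE = re.compile(r'^ERROR "(.*)" at line (\d+) in file\s*(.*)$')


def parse_startlog(text):
    """Return (errors, transitions) found in a StartLog excerpt."""
    errors = []
    transitions = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip the mail header and footer lines
        stamp, slot, msg = m.groups()
        err = ERROR_RE.match(msg)
        if err:
            errors.append({
                "time": stamp,
                "message": err.group(1),
                "line": int(err.group(2)),
                "file": err.group(3),
            })
        elif slot and msg.startswith("Changing state"):
            # Keep only the "Old -> New" part after the first ": ".
            transitions.append((stamp, slot, msg.split(": ", 1)[1]))
    return errors, transitions


# A few lines copied from the log above (backslashes doubled for Python).
SAMPLE = """\
8/18 18:11:59 slot1: Changing state: Matched -> Claimed
8/18 18:11:59 ERROR "Can't find WANT_SUSPEND in internal ClassAd" at line 1226 in file..\\src\\condor_startd.V6\\Resource.cpp
8/18 18:11:59 slot1: Changing state and activity: Claimed/Idle -> Preempting/Killing
8/18 18:11:59 startd exiting because of fatal exception.
"""

if __name__ == "__main__":
    errors, transitions = parse_startlog(SAMPLE)
    for e in errors:
        print(e["time"], e["message"])
    for t in transitions:
        print(*t)
```

This only summarizes the log; it does not say why WANT_SUSPEND is missing from the startd's ClassAd.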
Thanks in advance for your help,