[Condor-users] Schedd keeps dying in 7.0.3 under Solaris!
- Date: Thu, 10 Jul 2008 16:03:58 +0100
- From: Mark Calleja <M.Calleja@xxxxxxxxxxxxxxx>
- Subject: [Condor-users] Schedd keeps dying in 7.0.3 under Solaris!
Hi chaps,
We've hit a problem and we'd urgently like to hear of any solution. We
have a Solaris box that acts as a submit host and its schedd dies every
few minutes; the following snippet from the SchedLog is typical of the
symptom:
7/10 10:50:28 (pid:16042) Calling Handler <to startd <172.24.89.88:9108>>
7/10 10:50:28 (pid:16042) ERROR "Assertion ERROR on (mrec->request_claim_sock == sock)" at line 1361 in file dedicated_scheduler.C
7/10 10:50:43 (pid:16188) ******************************************************
7/10 10:50:43 (pid:16188) ** condor_schedd (CONDOR_SCHEDD) STARTING UP
7/10 10:50:43 (pid:16188) ** /prg/condor/sbin/condor_schedd
7/10 10:50:43 (pid:16188) ** $CondorVersion: 7.0.3 Jun 20 2008 BuildID: 91405 $
7/10 10:50:43 (pid:16188) ** $CondorPlatform: SUN4X-SOLARIS29 $
7/10 10:50:43 (pid:16188) ** PID = 16188
7/10 10:50:43 (pid:16188) ** Log last touched 7/10 10:50:28
7/10 10:50:43 (pid:16188) ******************************************************
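To gauge how often the schedd is actually restarting, the pattern in the snippet above (an ERROR line followed by a STARTING UP banner under a new pid) can be tallied with a small script. This is only a rough sketch: the function name summarize_schedlog and the year argument are assumptions made for illustration (the log timestamps carry no year), and it relies on nothing beyond the line format visible in the excerpt.

```python
import re
from datetime import datetime

# Matches the "M/D HH:MM:SS (pid:N) message" prefix seen in the excerpt above.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}) \(pid:(\d+)\)\s*(.*)$")


def summarize_schedlog(lines, year=2008):
    """Return (errors, startups) as lists of (timestamp, pid, message) tuples.

    `year` is an assumption supplied by the caller, because the log
    timestamps themselves do not include one.
    """
    errors, startups = [], []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # wrapped continuation line or unrelated text
        stamp, pid, msg = m.groups()
        when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S")
        if msg.startswith("ERROR"):
            errors.append((when, int(pid), msg))
        elif "STARTING UP" in msg:
            startups.append((when, int(pid), msg))
    return errors, startups
```

Feeding it the lines of the SchedLog (for example via open(path) on wherever that log lives locally) gives one entry per crash and one per restart, so the gaps between successive startup timestamps show how long each schedd instance survived.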
The OS details are:
% uname -a
SunOS <hostname> 5.9 Generic_112233-10 sun4u sparc SUNW,Sun-Fire-880
This seems related to the problem mentioned here:
http://www.cs.wisc.edu/condor/ligo-tickets/2237.html
Was that problem resolved? It's not apparent from the link. For now
we're downgrading that box to 6.8.8, but that can only be a short-term
solution.
Any clues/fixes out there?
Best regards,
Mark