
Re: [Condor-users] ERROR "Assertion ERROR on (result)" at line 319 in file NTreceiver



So this error is now proving to be quite a problem for one large cluster
in my system. Basically all the jobs in this cluster are causing this
assertion in the shadow code when they start to run.

Can someone with condor_shadow code access give me an idea of what might
be causing this assert to get triggered?

- Ian

-----Original Message-----
From: Ian Chesal 
Sent: May 27, 2007 9:29 PM
To: 'Condor-Users Mail List'
Subject: ERROR "Assertion ERROR on (result)" at line 319 in file
NTreceiver

I'm seeing a number of errors like this for processes in my ShadowLog
file. A complete snippet from the log file looks like this:

5/27 18:18:16 ******************************************************
5/27 18:18:16 ** condor_shadow (CONDOR_SHADOW) STARTING UP
5/27 18:18:16 ** /opt/condor/sbin/condor_shadow
5/27 18:18:16 ** $CondorVersion: 6.8.0 Jul 19 2006 $
5/27 18:18:16 ** $CondorPlatform: I386-LINUX_RHEL3 $ 
5/27 18:18:16 ** PID = 26521
5/27 18:18:16 ** Log last touched 5/27 18:18:16
5/27 18:18:16 ******************************************************
5/27 18:18:16 Using config source: /opt/condor/configs/condor_config.LINUX
5/27 18:18:16 Using local config sources: 
5/27 18:18:16    /build/condor/condor_config.local.LINUX
5/27 18:18:16 DaemonCore: Command Socket at <137.57.202.107:41873>
5/27 18:18:16 Initializing a VANILLA shadow for job 94812.0
5/27 18:18:16 (94812.0) (26521): Request to run on <137.57.233.187:1106> was REFUSED
5/27 18:18:16 (94812.0) (26521): Job 94812.0 is being evicted
5/27 18:18:16 (94812.0) (26521): logEvictEvent with unknown reason (108), aborting
5/27 18:18:16 (94812.0) (26521): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
5/27 18:18:23 (94673.23) (17666): condor_write(): Socket closed when trying to write buffer, fd is 7
5/27 18:18:23 (94673.23) (17666): Buf::write(): condor_write() failed
5/27 18:18:23 (94673.23) (17666): ERROR "Assertion ERROR on (result)" at line 319 in file NTreceivers.C
5/27 18:18:24 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
5/27 18:18:24 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
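
To get a rough count of how many jobs hit the assertion, something like
the following Python sketch could be run over the ShadowLog lines. It
assumes the per-job line shape seen in the snippet above, "M/D HH:MM:SS
(cluster.proc) (pid): message". The names count_assertions, LINE_RE and
ASSERT_RE are made up for illustration and are not part of Condor. It
only looks at the start of each assertion line, so it should not matter
whether long lines have been wrapped.

```python
import re
from collections import Counter

# Assumed per-job line shape, inferred from the log snippet:
#   M/D HH:MM:SS (cluster.proc) (pid): message
LINE_RE = re.compile(
    r'^\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2} '
    r'\((?P<job>\d+\.\d+)\) \((?P<pid>\d+)\): (?P<msg>.*)$'
)

# Matches the assertion text itself, e.g. ERROR "Assertion ERROR on (result)"
ASSERT_RE = re.compile(r'ERROR "Assertion ERROR on \([^)]*\)"')


def count_assertions(lines):
    """Return a Counter mapping job id (cluster.proc) to assertion count."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        # Lines without a (job) (pid) prefix, such as the startup banner,
        # do not match LINE_RE and are skipped.
        if m and ASSERT_RE.search(m.group("msg")):
            counts[m.group("job")] += 1
    return counts


# Three lines copied from the snippet, used as a small self-check.
sample = [
    '5/27 18:18:23 (94673.23) (17666): Buf::write(): condor_write() failed',
    '5/27 18:18:23 (94673.23) (17666): ERROR "Assertion ERROR on (result)" at',
    '5/27 18:18:16 (94812.0) (26521): Job 94812.0 is being evicted',
]
print(dict(count_assertions(sample)))  # {'94673.23': 1}
```

Passing an open file object instead of the sample list would give the
per-job tally for a whole log.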


Should I be worrying about the assertion error?

- Ian