
[Condor-users] ERROR starting jobs: Jobs get evicted for unknown reason (108)



Hi,

Many of our submitted jobs are randomly evicted immediately after they 
start. I have no idea what is going on: one log file says "Unknown 
Reason", and none of the other log files contain warnings or errors. 
This behaviour of Condor (6.8.0 on SUSE Linux 9 and 10) makes it 
practically unusable, because some jobs need 20-30 negotiation cycles 
before they actually start running. I also tried enabling more log 
output, but that output does not give any hint either as to why the jobs 
are evicted.

Any help is welcome,
Thomas

-------------------------------------

Part of the Setup:
WANT_SUSPEND = False
WANT_VACATE = False
START = True
SUSPEND = False
CONTINUE = True
PREEMPT = False
CLAIM_WORKLIFE = 0
MaxJobRetirementTime = 0
KILL = False
NEGOTIATOR_PRE_JOB_RANK = 0
NEGOTIATOR_POST_JOB_RANK = 0
PREEMPTION_REQUIREMENTS = False
PREEMPTION_RANK = 0

ShadowLog:
8/15 16:46:33 ******************************************************
8/15 16:46:33 ** condor_shadow (CONDOR_SHADOW) STARTING UP
8/15 16:46:33 ** /home/condor/condor-6.8.0/sbin/condor_shadow
8/15 16:46:33 ** $CondorVersion: 6.8.0 Jul 19 2006 $
8/15 16:46:33 ** $CondorPlatform: X86_64-LINUX_RHEL3 $
8/15 16:46:33 ** PID = 27459
8/15 16:46:33 ** Log last touched 8/15 16:46:31
8/15 16:46:33 ******************************************************
8/15 16:46:33 Using config source: /home/condor/condor_config
8/15 16:46:33 Using local config sources: 
8/15 16:46:33    /home/condor/hosts/dc08/condor_config.local
8/15 16:46:33 DaemonCore: Command Socket at <132.187.*.*:56626>
8/15 16:46:33 Initializing a VANILLA shadow for job 3105.0
8/15 16:46:33 (3105.0) (27459): Request to run on <132.187.*.*:58903> was REFUSED
8/15 16:46:33 (3105.0) (27459): Job 3105.0 is being evicted
8/15 16:46:33 (3105.0) (27459): logEvictEvent with unknown reason (108), aborting
8/15 16:46:33 (3105.0) (27459): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
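
To check whether the refusals cluster on particular execute machines, the ShadowLog can be scanned for the "was REFUSED" lines. The following is only a rough sketch, not a Condor tool: the function name count_refusals is made up for illustration, and it assumes that each log event sits on a single line in the real file (the excerpt in this mail may be wrapped by the mail client) and that the line format matches the excerpt above.

```python
import re
import sys
from collections import Counter

# Pattern derived from the ShadowLog excerpt in this mail; other Condor
# versions may format the line differently (assumption).
REFUSED_RE = re.compile(
    r"\((?P<job>\d+\.\d+)\) \(\d+\): "
    r"Request to run on <(?P<addr>[^>]+)>\s+was REFUSED"
)


def count_refusals(lines):
    """Count refused run requests per startd address and per job id.

    Hypothetical helper: takes any iterable of log lines and returns
    two Counters, (by_address, by_job).
    """
    by_addr, by_job = Counter(), Counter()
    for line in lines:
        match = REFUSED_RE.search(line)
        if match:
            by_addr[match.group("addr")] += 1
            by_job[match.group("job")] += 1
    return by_addr, by_job


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_refusals.py /path/to/ShadowLog
    with open(sys.argv[1]) as log_file:
        addresses, jobs = count_refusals(log_file)
    for addr, count in addresses.most_common():
        print(f"{count:5d}  {addr}")
```

If nearly all refusals point at the same few addresses, the StartLog on those machines would be the next place to look; an even spread would suggest a pool-wide setting instead.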

NegotiatorLog:
8/15 16:43:28     Request 03105.00000:
8/15 16:43:28       Matched 3105.0 tbretz@xxxxxxxxxxxxxxxxxxxxxx <132.187.47.28:52515> preempting none <132.187.47.22:58903> vm2@xxxxxxxxxxxxxxxxxxxxxxxxxxx
8/15 16:43:28       Successfully matched with vm2@xxxxxxxxxxxxxxxxxxxxxxxxxxx