Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] Request to run on <...:...> was REFUSED

Date: Sat, 09 Sep 2006 16:24:18 +0200
From: Horvátth Szabolcs <szabolcs@xxxxxxxxxxxxx>
Subject: [Condor-users] Request to run on <...:...> was REFUSED

Hi,

I found strange error messages in the logs while investigating the reason for some mysterious job evictions that seem to happen right when the job starts. In the shadow log I found the following:


9/9 15:21:54 Initializing a VANILLA shadow for job 165980.0
9/9 15:21:54 (165980.0) (5720): Request to run on <192.168.0.105:1040> was REFUSED
9/9 15:21:54 (165980.0) (5720): Job 165980.0 is being evicted
9/9 15:21:54 (165980.0) (5720): logEvictEvent with unknown reason (108), aborting
9/9 15:21:54 (165980.0) (5720): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 108
9/9 15:22:09 ******************************************************
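
To see how often and against which machines this happens, the shadow log can be scanned for the refusal line. The following is only a minimal sketch in Python, assuming the line layout shown in the excerpt above; the helper name `find_refusals` and the sample list are hypothetical and not part of Condor.

```python
import re

# Pattern for the refusal line as it appears in the excerpt above.
# \s* before "was" tolerates a missing space after the closing ">".
REFUSED_RE = re.compile(
    r"^(?P<date>\d+/\d+) (?P<time>\d+:\d+:\d+) "
    r"\((?P<job>\d+\.\d+)\) \((?P<pid>\d+)\): "
    r"Request to run on <(?P<addr>[^>]+)>\s*was REFUSED"
)


def find_refusals(lines):
    """Return one dict per 'was REFUSED' line found in the given log lines."""
    hits = []
    for line in lines:
        match = REFUSED_RE.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


# Sample input copied from the shadow log excerpt.
sample = [
    "9/9 15:21:54 Initializing a VANILLA shadow for job 165980.0",
    "9/9 15:21:54 (165980.0) (5720): Request to run on <192.168.0.105:1040> was REFUSED",
    "9/9 15:21:54 (165980.0) (5720): Job 165980.0 is being evicted",
]

for hit in find_refusals(sample):
    print(hit["time"], hit["job"], hit["addr"])
```

Grouping the results by `addr` would show whether the refusals come from one execute machine or from many.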


And in the starter log I found this:

9/9 15:22:09 ******************************************************
9/9 15:22:09 Using config source: C:\Condor\condor_config
9/9 15:22:09 Using local config sources:
9/9 15:22:09    C:\Condor/condor_config.local
9/9 15:22:09 DaemonCore: Command Socket at <192.168.0.105:1125>
9/9 15:22:09 Setting resource limits not implemented!
9/9 15:22:09 Communicating with shadow <192.168.0.50:4640>
9/9 15:22:09 Submitting machine is "gadget.digicpictures.local"
9/9 15:22:09 Job has WantIOProxy=true
9/9 15:22:09 Initialized IO Proxy.
9/9 15:22:09 File transfer completed successfully.
9/9 15:22:10 Starting a VANILLA universe job with ID: 165980.0
9/9 15:22:10 IWD: x:/work/condor\dir_255476
9/9 15:22:10 Output file: x:/work/condor\dir_255476\_condor_stdout
9/9 15:22:10 Error file: x:/work/condor\dir_255476\_condor_stderr
9/9 15:22:11 Renice expr "0" evaluated to 0
9/9 15:22:11 About to exec c:\tcl\bin\tclsh.exe //Sv_project1/projects/_extensions/Condor/render_mentalray_greedy.tcl 3.4 x:/temp/56/movie_56/shots_3d/shots/ke_020/KE_020_tomeg_block_00/1157804549/mi/tomeg_block_00.0082.mi R:/56/movie_56/shots_3d/shots/ke_020/frames/KE_020_tomeg_block_00/tomeg_block_00.0082.rgb render0027.digicpictures.local
9/9 15:22:11 Create_Process succeeded, pid=513524
9/9 15:26:31 IOProxy: accepting connection from 192.168.0.105
9/9 15:26:31 condor_read(): recv() returned -1, errno = 10054, assuming failure.
9/9 15:26:31 IOProxyHandler: closing connection to 192.168.0.105
9/9 15:26:51 IOProxy: accepting connection from 192.168.0.105
9/9 15:26:51 condor_read(): recv() returned -1, errno = 10054, assuming failure.
9/9 15:55:31 Process exited, pid=513524, status=0
9/9 15:55:35 Got SIGQUIT.  Performing fast shutdown.
9/9 15:55:35 ShutdownFast all jobs.
9/9 15:55:35 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
9/9 15:55:48 ******************************************************
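
The starter log can be summarized the same way. This is a rough sketch in Python, assuming the timestamp layout shown above and a year taken from the message date; the function name `summarize_starter` is hypothetical. It measures the time between the "Create_Process succeeded" and "Process exited" lines and counts the recv() failures with errno 10054, which is the Winsock code WSAECONNRESET (connection reset by peer).

```python
import re
from datetime import datetime

WSAECONNRESET = 10054  # Winsock error code: connection reset by peer

# Splits a starter log line into its "M/D HH:MM:SS" stamp and the message.
LINE_RE = re.compile(r"^(?P<stamp>\d+/\d+ \d+:\d+:\d+) (?P<msg>.*)$")


def summarize_starter(lines, year=2006):
    """Return (job runtime in seconds or None, number of connection resets).

    The log lines carry no year, so one is supplied (assumption: the year
    of the message date) to build full timestamps.
    """
    started = exited = None
    resets = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(
            f"{year}/{match.group('stamp')}", "%Y/%m/%d %H:%M:%S"
        )
        msg = match.group("msg")
        if msg.startswith("Create_Process succeeded"):
            started = stamp
        elif msg.startswith("Process exited"):
            exited = stamp
        elif f"errno = {WSAECONNRESET}" in msg:
            resets += 1
    runtime = (exited - started).total_seconds() if started and exited else None
    return runtime, resets


# Sample input copied from the starter log excerpt.
sample_starter = [
    "9/9 15:22:11 Create_Process succeeded, pid=513524",
    "9/9 15:26:31 condor_read(): recv() returned -1, errno = 10054, assuming failure.",
    "9/9 15:26:51 condor_read(): recv() returned -1, errno = 10054, assuming failure.",
    "9/9 15:55:31 Process exited, pid=513524, status=0",
]

print(summarize_starter(sample_starter))  # (2000.0, 2)
```

For the excerpt above this gives a runtime of 2000 seconds (about 33 minutes) and two reset connections, which matches the observation that the process ran to completion even though the chirp connections failed.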


I also found some information about this error in the archives of this list
(https://lists.cs.wisc.edu/archive/condor-users/2005-February/msg00260.shtml)
but could find neither the solution nor the source of the problem.

It's a bit confusing that the eviction happens instantly, yet the process looks as if it completed, although the periodically called chirp process could not connect to the scheduler.


WinXP, Condor 6.8.0 but 6.7.x had the same behaviour.

Cheers,
Szabolcs
