
Re: [Condor-users] Jobs remaining Idle



Shaun,

Hey buddy, I don't know if this will help you, but "failing to connect" could be
the result of your firewall not being configured properly. I ran into this
problem myself when I was getting Condor to work.
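
One quick way to test the firewall theory is to try a plain TCP connection to
the address and port that show up in the "Can't connect" log lines. Errno 10060
is the Winsock "connection timed out" code, which is what you would typically
see when a firewall silently drops the packets. Below is a minimal sketch,
assuming Python is available on the machines; the function name can_connect and
the command-line handling are only illustrative, and 9618 is simply the port
that appears in the quoted CollectorLog.

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A timeout or a refused connection both return False. This only checks
    basic TCP reachability; it says nothing about Condor's own protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (e.g. Winsock 10060), refusals, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical usage: python check_port.py <host> [port]
    if len(sys.argv) < 2:
        print("usage: check_port.py <host> [port]")
        sys.exit(2)
    host = sys.argv[1]
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 9618
    reachable = can_connect(host, port)
    print(f"{host}:{port} reachable: {reachable}")
    sys.exit(0 if reachable else 1)
```

Run it in both directions (from the XP box to the Server 2003 box and back). If
it fails in one direction only, the firewall on the receiving machine is a
likely suspect.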

Cheers

Danny

Quoting "Shaun J. O'Callaghan" <Shaun.OCallaghan@xxxxxxxxxxxx>:

> Hi there,
> 
> Firstly, apologies if this has been dealt with in this list already.
> I've searched through this list, and the docs, and don't seem to be able
> to find an answer.
> 
> I'm running a test Condor pool at the moment.  I have a Windows XP
> machine (the master server) and a Windows Server 2003 machine (the only
> other machine in the pool).
> 
> I've written a test application, a 'hello world' app, in C just to
> demonstrate that jobs actually get executed and run ok.  However, the
> jobs are queued and then appear to run briefly before entering the
> "Idle" state which is where they stay.  I submit the job from the
> Windows Server 2003 machine to the pool.
> 
> The submit file is as follows:
> 
> ---
> executable = 	condortestapp.exe
> universe =	vanilla
> Requirements = (OpSys == "WINNT50") || (OpSys == "WINNT51") || (OpSys ==
> "WINNT52")
> error = 	error.output
> output =	out.output
> 
> queue
> 
> ---
> 
> Negotiator.log has the following line:
> 
> 5/22 16:42:32 DC_AUTHENTICATE: attempt to open invalid session
> GEOG41:2204:1148048993:2, failing.
> 
> ---
> 
> CollectorLog.log has the following:
> 
> 5/22 16:42:08 (Sent 7 ads in response to query)
> 5/22 16:42:08 Got QUERY_STARTD_PVT_ADS
> 5/22 16:42:08 (Sent 2 ads in response to query)
> 5/22 16:42:32 SubmittorAd  : Inserting ** "<
> Administrator@xxxxxxxxxxxxxxxxxx , xxx.xxx.xxx.xxx >"
> 5/22 16:42:32 stats: Inserting new hashent for
> 'Submittor':'Administrator@xxxxxxxxxxxxxxxxxx:' xxx.xxx.xxx.xxx'
> 5/22 16:42:49 Got QUERY_SCHEDD_ADS
> 5/22 16:42:49 (Sent 1 ads in response to query)
> 5/22 16:46:44 Housekeeper:  Ready to clean old ads
> 5/22 16:46:44 	Cleaning StartdAds ...
> 5/22 16:46:44 	Cleaning StartdPrivateAds ...
> 5/22 16:46:44 	Cleaning ScheddAds ...
> 5/22 16:46:44 	Cleaning SubmittorAds ...
> 5/22 16:46:44 	Cleaning LicenseAds ...
> 5/22 16:46:44 	Cleaning MasterAds ...
> 5/22 16:46:44 	Cleaning CkptServerAds ...
> 5/22 16:46:44 	Cleaning CollectorAds ...
> 5/22 16:46:44 	Cleaning StorageAds ...
> 5/22 16:46:44 Housekeeper:  Done cleaning
> 5/22 16:46:48 Can't connect to < xxx.xxx.xxx.xxx:9618>:0, errno = 10060
> 5/22 16:46:48 Will keep trying for 10 seconds...
> 5/22 16:46:57 Connect failed for 10 seconds; returning FALSE
> 5/22 16:46:57 ERROR:
> SECMAN:2003:TCP connection to <xxx.xxx.xxx.xxx:9618> failed
> 
> 5/22 16:46:57 Can't send UPDATE_COLLECTOR_AD to collector
> (condor.cs.wisc.edu): Failed to send UDP update command to collector
> 5/22 16:47:09 (Sent 8 ads in response to query)
> 5/22 16:47:09 Got QUERY_STARTD_PVT_ADS
> 5/22 16:47:09 (Sent 2 ads in response to query)
> 
> 
> Running condor_q -analyze on the Windows Server 2003 machine gives the
> following output:
> 
> 011.000:  Run analysis summary.  Of 2 machines,
>       0 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match, but are serving users with a better priority in the pool
>       2 match, match, but reject the job for unknown reasons
>       0 match, but will not currently preempt their existing job
>       0 are available to run your job
>         Last successful match: Mon May 22 16:47:10 2006
> 
> 1 jobs; 1 idle, 0 running, 0 held
> 
> ----
> 
> 
> Running condor_q -global on the Windows XP machine (the central server)
> gives the following output:
> 
> ---
> 
> -- Failed to fetch ads from: <xxx.xxx.xxx.xxx:12566> :
> internaldomain.com (IP of Windows Server 2003)
> 
> 
> 
> If anybody can shed any light on why these jobs are remaining idle,
> that'd be great. I'm sure it's a pretty straightforward error, but I
> just can't seem to put my finger on it.
> 
> Thanks in advance,
> 
> Shaun James O'Callaghan
> 
> 
> 
> 
> 
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>