
Re: [Condor-users] jobs won't run: MY.Rank > MY.CurrentRank



Hi,

The changed IP addresses need to be reflected in the
HOSTALLOW_* entries (HOSTALLOW_READ, HOSTALLOW_WRITE,
and so on) in the condor_config file on the central
manager. The central manager runs the negotiator and
collector daemons, and the collector will only accept
requests from machines that match those entries. After
editing the file, the daemons need a condor_reconfig
(or a restart) to pick up the change.
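
To spot which machines the new addresses left out of the allow
list, a rough check along these lines may help. This is only a
sketch: it assumes the entries are IPs or host names with simple
"*" wildcards, the function name unlisted_hosts and the sample
addresses are made up for illustration, and it does not reproduce
Condor's own matching (no netmasks, no DNS lookups).

```python
from fnmatch import fnmatch


def unlisted_hosts(allow_entries, hosts):
    """Return the hosts that match none of the allow-list patterns.

    allow_entries: comma-separated string, as found in an allow-list setting.
    hosts: iterable of IP addresses or host names to check.
    Only shell-style wildcards are handled; this approximates,
    and does not replace, Condor's own matcher.
    """
    patterns = [p.strip().lower() for p in allow_entries.split(",") if p.strip()]
    return [
        h for h in hosts
        if not any(fnmatch(h.lower(), p) for p in patterns)
    ]


# Hypothetical values, for illustration only.
allow = "192.168.1.*, *.example.org"
pool = ["192.168.1.17", "192.168.2.40", "node3.example.org"]
print(unlisted_hosts(allow, pool))  # prints ['192.168.2.40']
```

Any machine the check reports is a candidate for a missing or
stale entry on the central manager.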

Additionally, on the submission machine, the job log
file and the schedd log file (SchedLog) may be helpful.
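
To trace one machine through those logs, a small filter such as
the one below could be enough. It is a sketch under assumptions:
the helper name lines_mentioning is hypothetical, and the log
path in the usage note is a placeholder for wherever the logs
live on your system.

```python
def lines_mentioning(lines, needle):
    """Yield (line_number, text) for every line that contains needle.

    lines: any iterable of strings, for example an open log file.
    needle: the IP address or host name to look for (plain substring match).
    """
    for number, line in enumerate(lines, start=1):
        if needle in line:
            yield number, line.rstrip("\n")
```

For example, with a placeholder path and address:

    with open("/path/to/SchedLog", errors="replace") as log:
        for number, text in lines_mentioning(log, "192.168.2.40"):
            print(number, text)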

Gabriel



> There have been a number of changes to the IP addresses in the past few
> weeks.
> These changes were made and the latest version of Condor (6.6.10) was
> installed. The machines then accepted at least one job before entering
> the unclaimed/idle state. I will try to access the log files on the
> server and trace activity for one of these machines. The problem could
> certainly be related to that (in fact, we suspect this network change
> but are not sure how to trace or fix it; one option is to stop all
> machines, including the master, and restart everything).
>
> Bob Orchard
> National Research Council Canada / Conseil national de recherches Canada
> Institute for Information Technology / Institut de technologie de l'information
> 1200 Montreal Road, Building M-50 / M50, 1200 chemin Montréal
> Ottawa, ON, Canada K1A 0R6 / Ottawa (Ontario) Canada K1A 0R6
> (613) 993-8557
> (613) 952-0215 Fax / télécopieur
> bob.orchard@xxxxxxxxxxxxxx
> Government of Canada | Gouvernement du Canada
>
>
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Gabriel Mateescu
> Sent: Wednesday, December 07, 2005 8:19 PM
> To: Condor-Users Mail List
> Cc: Condor-Users Mail List
> Subject: Re: [Condor-users] jobs won't run: MY.Rank > MY.CurrentRank
>
>
>> We have a similar problem (with fewer machines): many seem to get
>> stuck in the unclaimed/idle state and will not run jobs. Running
>> analyze shows 'reject the job for unknown reasons' for these machines.
>> They ran jobs yesterday for a while but no longer will.
>>
>> Bob Orchard
>>
>
> Hi,
>
> Did something in the environment change, such
> as IP addresses or host names?
>
> When "analyze" does not give helpful information,
> there are additional places to check:
>
>   1. the job log file;
>   2. the schedd daemon log file;
>   3. the negotiator daemon log file.
>
> Gabriel
>
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users