
Re: [Condor-users] 2 match but reject the job for unknown reasons



Hello,

We have found why the machine was rejecting our jobs: we had commented
out important lines in the local configuration file and forgot to
uncomment them. Sorry for the trouble, and thank you for the tips.
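
For anyone who hits the same symptom, a quick check along these lines
can show whether policy settings are commented out in a local
configuration file. This is only a minimal sketch, not something we
actually ran: the list of watched settings, the sample text, and the
function name commented_out_settings are assumptions for illustration.

```python
import re

# Settings to look for; this list is an assumption, adjust it for your pool.
WATCHED = ("START", "RANK", "SUSPEND", "PREEMPT", "KILL")

# Matches a commented-out assignment such as "# START = TRUE".
_COMMENTED = re.compile(r"^\s*#+\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=")


def commented_out_settings(text, watched=WATCHED):
    """Return (line_number, name) for each watched setting that is commented out."""
    wanted = {name.upper() for name in watched}  # compare names case-insensitively
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = _COMMENTED.match(line)
        if match and match.group(1).upper() in wanted:
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical local configuration text, not our real file.
    sample = """\
# Local overrides (hypothetical example)
#START = TRUE
# RANK = 0
SUSPEND = FALSE
"""
    for lineno, name in commented_out_settings(sample):
        print(f"line {lineno}: {name} is commented out")
```

To check a real file, read its contents and pass the text to the
function. The path of the local configuration file differs between
installations, so it is deliberately left out here.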

Diana

On Fri, Feb 26, 2010 at 3:31 PM, Steven Timm <timm@xxxxxxxx> wrote:
> One of the many "unknown reasons" can be that the machine has
> a RANK expression or a START expression that evaluates
> to UNDEFINED.
>
> condor_q -ana -l <jobid> will give you the latest reason why the
> job was rejected for "unknown reasons".  The NegotiatorLog
> at debug level D_MATCH or higher may give you some clues too.
>
> Steve
>
> On Fri, 26 Feb 2010, Steffen Grunewald wrote:
>
>> On Fri, Feb 26, 2010 at 11:33:25AM +0000, Diana Lousa wrote:
>>>
>>> Dear all,
>>>
>>> We have a Condor pool with several cores, and we recently noticed
>>> that one of our machines rejects all jobs for unknown reasons. Here
>>> is the output of the command "condor_q -analyze" for one job that
>>> was rejected:
>>>
>>> condor_q -analyze 475178.0
>>
>> Try -better-analyze ...
>>
>>
>
> --
> ------------------------------------------------------------------
> Steven C. Timm, Ph.D  (630) 840-8525
> timm@xxxxxxxx  http://home.fnal.gov/~timm/
> Fermilab Computing Division, Scientific Computing Facilities,
> Grid Facilities Department, FermiGrid Services Group, Assistant Group
> Leader.
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
>



-- 
Diana Lousa
PhD student
Protein Modeling Laboratory
ITQB/UNL
Oeiras, Portugal