
Re: [Condor-users] rooster on linux, take 3





On 11/29/11 3:52 PM, Dimitri Maziuk wrote:
On 11/29/2011 08:41 AM, Dan Bradley wrote:
After testing to see which machines match the job, the negotiator sorts
the matching machines and chooses the most desirable one.  If it chooses
an offline machine, it should inform the collector and update
MachineLastMatchTime.  Can you confirm from your negotiator log whether
it is choosing the offline machine or not?  From the log you posted, I
can only see that the offline machine was selected as a candidate, not
whether it was actually chosen.
This is unfortunately useless on a real-life pool. I'm getting close to
a hundred meg of


11/29/11 15:34:40 Job 977853.0 does match with slot8@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot9@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot10@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot5@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot11@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot12@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot6@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot13@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot7@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot14@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot8@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot15@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot16@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot5@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot6@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot7@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40       Rejected 977853.0 bmrbgrid@xxxxxxxxxxxxx <144.92.167.254:9617?sock=13250_c2fa_3>: no match found

Without any visible indication as to why "no match found". (falcon and robin are offline.)

Ugh. Looking at the negotiator code, I see that there are several possible reasons why a machine that matches the job could still be omitted from the final list of possible candidates. I'll add more debugging output to make it clear in the future which of those reasons is the actual explanation.

In the meantime, I suggest that you email condor-admin@xxxxxxxxxxx and provide two things:

1. Output of condor_status -long

2. Negotiator log for a full negotiation cycle (if possible)
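
Since a log of that size is hard to read by eye, here is a minimal Python sketch for condensing it to one entry per job before digging in. It is not part of Condor. The two regular expressions are inferred only from the "does match with" and "Rejected" lines quoted above, so other Condor versions or debug levels may need different patterns. The function names summarize and print_summary are hypothetical, and the sketch assumes each log entry sits on a single line (as in the log file itself, not as wrapped by a mail client).

```python
import re
from collections import defaultdict

# Patterns inferred from the log excerpt quoted in this thread (an assumption,
# not a documented format).
MATCH_RE = re.compile(r"^(\S+ \S+) Job (\d+\.\d+) does match with (\S+)")
REJECT_RE = re.compile(r"^(\S+ \S+)\s+Rejected (\d+\.\d+) (\S+) <[^>]*>: (.*)$")


def summarize(lines):
    """Return {job_id: {"slots": [...], "rejected": reason or None}}."""
    summary = defaultdict(lambda: {"slots": [], "rejected": None})
    for line in lines:
        line = line.rstrip("\n")
        m = MATCH_RE.match(line)
        if m:
            # Record every slot the negotiator reported as matching this job.
            summary[m.group(2)]["slots"].append(m.group(3))
            continue
        m = REJECT_RE.match(line)
        if m:
            # Keep the reason text that follows the submitter address.
            summary[m.group(2)]["rejected"] = m.group(4)
    return dict(summary)


def print_summary(path):
    """Print one line per job found in the log file at `path`."""
    with open(path, errors="replace") as fh:
        for job, info in summarize(fh).items():
            reason = info["rejected"] or "not rejected"
            print(f"{job}: {len(info['slots'])} matching slots, {reason}")
```

Calling print_summary on a copy of the negotiator log would then print, for each job, how many slots matched and the rejection reason, if any. This only shortens what the log already says; it cannot show why a matching machine was dropped from the final candidates.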

--Dan