
Re: [Condor-users] timeout reading buffer




On Mar 1, 2006, at 1:04 PM, Maxim Kovgan wrote:



On 3/1/06, Preston Smith <psmith@xxxxxxxxxx> wrote:
Yea, I suppose that would've been helpful..

I could un-wedge things by holding a slew of the jobs on that schedd...
I release the whole mess of them, and can get this to happen again.
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Note that I said that by releasing the jobs, I made the error re-occur.
Same error occurred at 12:50 today.


SchedLog bits attached from when the negotiation tries to run:

The only problem with this SchedLog is that the error occurred on
2/28, and your SchedLog is from 3/1.

Can you get a relevant SchedLog?
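One way to pull only the relevant part of a log is to filter it by timestamp. The following is a rough sketch, not a Condor tool: it assumes every entry starts with a "M/D HH:MM:SS" stamp as in the excerpts in this thread, that the year is not in the log and has to be supplied, and that lines without a stamp are wrapped continuations of the entry before them. The name lines_in_window is made up for illustration.

```python
import re
from datetime import datetime

# Assumed entry prefix, based on the excerpts in this thread: "2/28 15:45:15 ..."
STAMP = re.compile(r"^(\d{1,2})/(\d{1,2}) (\d{2}):(\d{2}):(\d{2})\b")


def lines_in_window(lines, start, end, year=2006):
    """Yield log lines whose timestamp falls within [start, end].

    Lines without a timestamp are treated as continuations of the
    previous timestamped line and are kept or dropped along with it.
    """
    keep = False
    for line in lines:
        m = STAMP.match(line)
        if m:
            month, day, hh, mm, ss = map(int, m.groups())
            keep = start <= datetime(year, month, day, hh, mm, ss) <= end
        if keep:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Two lines copied from the negotiator log quoted in this thread.
    sample = [
        "2/28 15:44:45     Got NO_MORE_JOBS;  done negotiating",
        "2/28 15:45:15 condor_read(): timeout reading buffer.",
    ]
    window = (datetime(2006, 2, 28, 15, 45, 0), datetime(2006, 2, 28, 15, 46, 0))
    for entry in lines_in_window(sample, *window):
        print(entry)
```

To use it on a real log, pass an open file object instead of the sample list and pick a window of a minute or two around the time the negotiator reported the timeout.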



-Preston






On Mar 1, 2006, at 11:55 AM, Jaime Frey wrote:

> On Feb 28, 2006, at 3:00 PM, Preston Smith wrote:
>
>> Right as our condor pools reach about 100% capacity, one of the
>> busiest schedds basically stops running jobs.. almost all run down
>> to idle..
>>
>> The negotiator logs:
>>
>> 2/28 15:44:45     Got NO_MORE_JOBS;  done negotiating
>> 2/28 15:44:45   Negotiating with user@xxxxxxxxxxxxxxx at <128.211.128.11:59684>
>> 2/28 15:45:15 condor_read(): timeout reading buffer.
>> 2/28 15:45:15     Failed to get reply from schedd
>> 2/28 15:45:15   Error: Ignoring schedd for this cycle
>>
>>
>> condor_q on that schedd shows:
>> 3342 jobs; 3330 idle, 10 running, 2 held
>>
>>
>> ShadowLog on 128.211.128.11 shows:
>> 2/28 15:48:08 (21939.0) (32200): condor_read(): timeout reading buffer.
>> 2/28 15:48:08 (21939.0) (32200): AUTHENTICATE: handshake failed!
>> 2/28 15:48:08 (21939.0) (32200): Authentication Error AUTHENTICATE:1002:Failure performing handshake
>>
>>
>> Any suggestions on troubleshooting these timeouts?
>> We're running 6.6.10..
>
> The most useful information would be the schedd log of 128.211.128.11
> at the time of the timeout.
>
> +--------------------------------+-----------------------------------+
> | Jaime Frey                     | I used to be a heavy gambler.     |
> | jfrey@xxxxxxxxxxx              | But now I just make mental bets.  |
> | http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
> +--------------------------------+-----------------------------------+
>
>
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users

--
Preston Smith  <psmith@xxxxxxxxxx>
Systems Research Engineer
Rosen Center for Advanced Computing, Purdue University





_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users



