Re: [Condor-users] Trivial jobs occasionally running for hours
- Date: Fri, 15 Oct 2010 07:10:03 -0400
- From: Matthew Farrellee <matt@xxxxxxxxxx>
- Subject: Re: [Condor-users] Trivial jobs occasionally running for hours
(inline your trimmed top post)
On 10/13/2010 12:14 PM, Paul Haldane wrote:
[I wouldn't normally top-post but best I can do without losing
Thanks - Mark Calleja pointed me at the error that I was consistently
overlooking. I'd also had one of those "doh" moments shortly
after sending the original message and had spotted the (now very
obvious) error in the logs - "Sock::bindWithin - failed to bind any
port within (9600 ~ 9700)".
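
[Editorial sketch, not part of the original message.] The bindWithin error quoted above indicates that no port in the configured range could be bound. One rough way to see how much headroom a range has is to try binding each port in turn. The Python below is a minimal sketch under stated assumptions: the function name count_bindable_ports is hypothetical, it only tests TCP on the loopback address, and it is not how Condor itself selects ports.

```python
import socket


def count_bindable_ports(low, high, host="127.0.0.1"):
    """Return how many TCP ports in [low, high] can be bound on host right now.

    Hypothetical helper for illustration only; Condor's own port
    selection logic is not reproduced here.
    """
    free = 0
    for port in range(low, high + 1):
        # Each socket is closed as soon as the probe finishes.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind((host, port))
            except OSError:
                # Port is in use, reserved, or otherwise unavailable.
                continue
            free += 1
    return free


if __name__ == "__main__":
    # The range from the error message: 9600 through 9700 inclusive.
    print(count_bindable_ports(9600, 9700))
```

A result at or near zero on a busy machine would be consistent with the error in the log. Note that 9600-9700 covers only 101 ports, while 9600-19700 covers 10101.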
Yeah, that's no fun. If you're using LOWPORT/HIGHPORT you should
I've changed the range to 9600-19700 and restarted. I don't get the
errors, but the first attempt only ran one job at a time and the second isn't
running any (though they're all nicely queued). I suspect this is a
completely unrelated problem.
I have wondered whether, as you suggest, using trivial jobs for testing
is unfair on the system. I should probably set up a more substantial
test job (and be more patient).
Trivial jobs aren't really unfair, but if you want to stress the system
with short jobs you should run the 7.5 series. It has a number of
optimizations for shorter jobs, including recycling shadows to avoid
process management overhead.