Re: [HTCondor-users] ERROR: Failed to connect to local queue manager
- Date: Fri, 24 Mar 2017 22:03:33 +0000
- From: Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] ERROR: Failed to connect to local queue manager
I found that back when we first switched to HTCondor, users would loop through condor_submit calls, since that's how they used to do it with Grid Engine qsub. It took some time for them to get the hang of how $(Process) and queue work together. Once they did, they were pleased to submit 1,000 Monte Carlo runs in seconds instead of 20 minutes. And in 8.4, with the queue in/matching/from feature, there's another bit of learning curve and some code changes, but it simplifies things even more and makes submissions that much more efficient.
Users running a zillion condor_submits will never regret learning how to properly set up a "queue 1 zillion".
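
For anyone who hasn't made the switch yet, here is a minimal sketch of what a single-submit Monte Carlo run might look like. The executable name, its --seed argument, and the file names are made up for illustration, so adjust them to your own job:

```
# montecarlo.sub: one condor_submit call queues all 1,000 runs.
# montecarlo.sh and its --seed option are hypothetical placeholders.
executable = montecarlo.sh

# $(Process) expands to 0, 1, 2, ... 999, one value per queued job.
arguments  = --seed $(Process)

# Separate stdout/stderr per job, one shared event log for the cluster.
output     = run.$(Process).out
error      = run.$(Process).err
log        = montecarlo.log

queue 1000
```

Running "condor_submit montecarlo.sub" once replaces the 1,000-iteration shell loop. With 8.4 or later, the last line can instead iterate over files or a list, for example "queue input_file matching files *.dat" or "queue seed from seeds.txt" (input_file, seed, and seeds.txt are again just example names), and the chosen variable is then available as $(input_file) or $(seed) elsewhere in the file.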
> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On
> Behalf Of Todd L Miller
> Sent: Wednesday, March 22, 2017 10:41 AM
> It looks like the schedd's too busy, probably dealing with all the
> submits that just happened, and isn't answering. Check the schedd's log to
> see what it's up to. If your user is calling condor_submit frequently, it may
> help to combine submits into one (using the extra queue commands), or -- as
> a last resort -- use DAGMan to help throttle submits.
> - ToddM