Re: [HTCondor-users] ERROR: Failed to connect to local queue manager



When we first switched to HTCondor, I found that users would loop over condor_submit calls, since that's how they had done it with Grid Engine's qsub. It took them some time to get the hang of how $(Process) and queue work. Once they did, they were pleased to submit 1,000 Monte Carlo runs in seconds instead of 20 minutes. In 8.4, the queue in/matching/from feature brings another bit of learning curve and code changes, but it simplifies things even more and makes submissions that much more efficient.

Users running a zillion condor_submits will never regret learning how to properly set up a "queue 1 zillion".
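
To make that concrete, here is a rough sketch of the single-submit approach: a small Python helper that writes out one submit description queuing all the runs at once, rather than invoking condor_submit once per run. The executable name (run_mc.sh), the output/error/log file names, and the helper itself are made up for illustration and are not from anyone's actual setup; only the submit commands and the $(Process) macro are standard HTCondor.

```python
def build_submit_description(executable, n_runs):
    """Return the text of one submit description that queues n_runs jobs.

    $(Process) expands to 0 .. n_runs-1, so each job gets its own
    argument and its own output/error files. All file names here are
    hypothetical placeholders.
    """
    lines = [
        f"executable = {executable}",
        "arguments  = $(Process)",
        "output     = run_$(Process).out",
        "error      = run_$(Process).err",
        "log        = runs.log",
        f"queue {n_runs}",
    ]
    return "\n".join(lines) + "\n"


# One description for 1,000 Monte Carlo runs; pass the saved file to a
# single condor_submit call instead of looping.
submit_text = build_submit_description("run_mc.sh", 1000)
```

Saving submit_text to a file and running condor_submit on it once puts all 1,000 jobs in a single cluster. With 8.4 or later, the last line could instead be something like "queue seed from seeds.txt" (seeds.txt being a hypothetical file with one value per line), with $(seed) used in place of $(Process).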

	-Michael Pelletier.

> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On
> Behalf Of Todd L Miller
> Sent: Wednesday, March 22, 2017 10:41 AM
>  	It looks like the schedd's too busy, probably dealing with all the
> submits that just happened, and isn't answering.  Check the schedd's log to
> see what it's up to.  If your user is calling condor_submit frequently, it may
> help to combine submits into one (using the extra queue commands), or -- as
> a last resort -- use DAGMan to help throttle submits.
> 
> - ToddM