
Re: [Condor-users] Spawning jobs depending on system load and start_delay, start_count



Heiko Burghardt wrote:
> Hello,
> 
> I have just started studying the Condor documentation and have run some
> tests with it. As far as I can tell at my current level of knowledge,
> Condor is very complex and flexible, but I have not yet found out how to
> configure it to solve a problem we have. So I hope someone on this list
> can help me.
> 
> In detail:
> We have a partner we are sending messages to. For each message we send
> we normally receive between 3 and 10 messages in return after some time.
> In total, we are talking about thousands of messages a day.
> The sending and receiving is done by scripts that connect to a
> database engine, log the message, and look up references for further
> processing. Sometimes we send and receive so many messages that
> the system load rises rapidly, simply because too many jobs run
> concurrently.
> That's why we would like to use Condor as a queuing mechanism: it
> would let us regulate the flow of jobs for better throughput on the
> system. So far, though, I have only found out how to start a limited
> number of jobs in a certain time frame. To do so I use the local
> universe for the jobs and the following parameters in the configuration
> file:
> 
> JOB_START_DELAY = 1
> JOB_START_COUNT = 5
> 
> This works fine for queuing the jobs in general. But we would like
> to combine this feature with a load-dependent rule in Condor, that is:
> stop running jobs when the load is higher than a value X. That's where
> I fail, as the UWCS_START, UWCS_SUSPEND, ... rules do not seem to work in
> the local universe. With e.g. the scheduler universe the situation is
> reversed: load dependency works (with some restrictions), but there is no
> restriction on the number of jobs in a time frame.
> 
> Does anybody have an idea how to set up Condor for our requirements?
> 
> Thanks in advance.
> 
> Best regards,
> Heiko
> 

Have you considered running a condor_startd next to your condor_schedd? That can get you both types of policy.

A little-known feature of the Schedd even gives you a way to skip using a Collector and Negotiator.

http://www.cs.wisc.edu/condor/manual/v7.4/3_3Configuration.html#19407
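
As a rough sketch of the startd side only: the MAX_LOAD macro name and its value are illustrative assumptions, not recommended settings, and the jobs would have to be submitted to a universe the startd actually runs (such as vanilla) rather than the local universe. The Schedd setting that lets you skip the Collector and Negotiator is not shown here; see the manual section linked above for that.

```
# Run a startd next to the schedd on the same machine
DAEMON_LIST = MASTER, SCHEDD, STARTD

# Hypothetical threshold ("value X"); tune for your system
MAX_LOAD = 4.0

# Only start jobs while the machine load is below the threshold
START = (TotalLoadAvg < $(MAX_LOAD))

# Suspend running jobs when the load climbs past the threshold,
# and resume them once it drops again
WANT_SUSPEND = True
SUSPEND  = (TotalLoadAvg >= $(MAX_LOAD))
CONTINUE = (TotalLoadAvg < $(MAX_LOAD))

# Schedd-side throttle from the original post:
# start at most 5 jobs per 1-second interval
JOB_START_DELAY = 1
JOB_START_COUNT = 5
```

Note that TotalLoadAvg includes the load generated by the Condor jobs themselves, so a threshold set too low may cause jobs to suspend each other; whether that matters depends on how heavy each script is.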

Best,


matt