
Re: [HTCondor-users] condor_submit next_job_start_delay is not working



Hey, thanks for responding. I have added my comments inline.

On Wed, Apr 15, 2015 at 5:20 PM, Krieger, Donald N. <kriegerd@xxxxxxxx> wrote:

I wonder if the start_delay stuff you are trying is designed to affect only the queueing and not the run time.

In any case, here are a couple of thoughts which might be helpful.


I am submitting EC2 grid jobs to AWS using DAGMan. While submitting them, I get a Server.InsufficientInstanceCapacity error from AWS because I am submitting 10 jobs at the same time; in total I am submitting 100 jobs in 10 minutes.

Then I reduced the number of jobs submitted at once to 2, for a total of 20 jobs in 10 minutes. Even so, I still get the Server.InsufficientInstanceCapacity error, but only for one specific job, which launches 3 instances of the m3.2xlarge instance type at the same time. I want to put some delay between the queued jobs so that they won't request instances of the same type at the same time.

Does that make sense? Is there a better way of handling it?

First, it might be worthwhile to describe why you want the delays.

Someone may see from that another way to accomplish that.


This might not work in my current scenario because I am submitting a DAG to run the jobs, so DAGMan is the one submitting them.

Second, here's a brute force approach. You can control when each job in a sequence starts running by issuing your condor_submit commands from a script which does the following:

foreach num (`echo "1 2 3 4 5"`)
  condor_submit Job${num}
  wait till Job${num}.log appears
  poll Job${num}.log for the line which appears when the job actually begins.
    When that happens there will be a line something like this:
    001 (16974323.000.000) 04/15 09:19:22 Job executing on host: <10.1.13.52:47173?CCBID=129.79.53.179:9899#448633&noUDP>
  sleep 5 seconds
end
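
The loop above could be written out roughly as follows. This is only a sketch under stated assumptions: the helper names (log_shows_execution, wait_for_execution, submit_in_sequence) are made up for illustration, the log file is assumed to be named after the submit file with a ".log" suffix as in the pseudocode, and it has not been checked here whether EC2 grid jobs write the same "Job executing on host" event line to their log.

```python
import subprocess
import time
from pathlib import Path

# Text written to the job event log when a job starts (event 001),
# taken from the sample log line quoted in this thread.
EXECUTE_MARKER = "Job executing on host"


def log_shows_execution(log_text: str) -> bool:
    """Return True if any event line in the log reports that the job started."""
    return any(
        line.startswith("001 ") and EXECUTE_MARKER in line
        for line in log_text.splitlines()
    )


def wait_for_execution(log_path: Path, poll_seconds: float = 5.0,
                       timeout_seconds: float = 3600.0) -> bool:
    """Poll log_path until an execute event appears; return False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if log_path.exists() and log_shows_execution(log_path.read_text()):
            return True
        time.sleep(poll_seconds)
    return False


def submit_in_sequence(submit_files, gap_seconds: float = 5.0) -> None:
    """Submit each file only after the previous job has started executing."""
    for submit_file in submit_files:
        subprocess.run(["condor_submit", submit_file], check=True)
        # Assumption: the submit file sets "log = <submit file name>.log".
        log_path = Path(f"{submit_file}.log")
        if not wait_for_execution(log_path):
            raise TimeoutError(f"{submit_file} did not start in time")
        time.sleep(gap_seconds)
```

Calling submit_in_sequence(["Job1", "Job2", "Job3", "Job4", "Job5"]) would mirror the foreach loop. A log file left over from an earlier run would make the wait return immediately, so stale logs should be removed first.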

Regards,

Don

Don Krieger, Ph.D.

Department of Neurological Surgery

University of Pittsburgh


From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Sridhar Thumma
Sent: Wednesday, April 15, 2015 7:22 AM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] condor_submit next_job_start_delay is not working


Hi,


Does anyone have any idea? I'm blocked because of this issue.

On Wed, Apr 15, 2015 at 12:20 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:

I tried deferral_time and deferral_prep_time as shown below. Still no luck: all jobs start at the same time. I want condor to execute the queued jobs with a delay between them. I have one submit file with 5 queue statements in it.

Something like this:

deferral_time = (time() + 600)
deferral_prep_time = 90
queue
...
deferral_time = (time() + 600)
deferral_prep_time = 180
queue
...
deferral_time = (time() + 600)
deferral_prep_time = 270
queue
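
In the fragment above every block uses the same deferral_time expression and only deferral_prep_time changes. As far as the condor_submit documentation describes it, deferral_prep_time is how long before deferral_time the job may be sent to the machine, so varying it alone would not be expected to stagger the starts. A minimal sketch of generating blocks whose deferral_time values differ is below. The function names are hypothetical, the 600 s and 90 s values are just the numbers used in this thread, and nothing here confirms that job deferral is honored for EC2 grid jobs.

```python
import time


def staggered_deferral_times(first_start: int, job_count: int, gap_seconds: int):
    """Return one absolute start time (Unix epoch seconds) per job, gap_seconds apart."""
    return [first_start + i * gap_seconds for i in range(job_count)]


def render_queue_blocks(start_times, prep_seconds: int = 60) -> str:
    """Render one deferral_time / deferral_prep_time / queue block per start time."""
    blocks = []
    for start in start_times:
        blocks.append(
            f"deferral_time = {start}\n"
            f"deferral_prep_time = {prep_seconds}\n"
            "queue\n"
        )
    return "\n".join(blocks)


# Example: three jobs, the first 600 s from now, then one every 90 s.
first = int(time.time()) + 600
print(render_queue_blocks(staggered_deferral_times(first, 3, 90)))
```

The printed text would be pasted in place of the three deferral blocks, with the rest of the submit file unchanged.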

Can anyone please help me out with this?

On Tue, Apr 14, 2015 at 11:15 PM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:

Hi,

I have a submit file with 5 queue statements in it. I want condor to wait a few seconds between the start of each queued job. I tried next_job_start_delay=90, but it didn't work.

How can I submit the queued jobs with a delay between each of them?

submit file:

Rank = 500 - TotalLoadAvg
next_job_start_delay=90
Queue

ec2_user_data_file = userdata_2.sh
Rank = 500 - TotalLoadAvg
next_job_start_delay=90
Queue

ec2_user_data_file = userdata_3.sh
Rank = 500 - TotalLoadAvg
next_job_start_delay=90
Queue

ec2_user_data_file = userdata_3.sh
Rank = 500 - TotalLoadAvg
next_job_start_delay=90
Queue

ec2_user_data_file = userdata_3.sh
Rank = 500 - TotalLoadAvg
next_job_start_delay=90
Queue

