
Re: [Condor-users] GCB Performance



On 1/19/06, Chris Miles <chrismiles@xxxxxxxxxxxxxxxx> wrote:
> I have done some testing with JOB_START_COUNT and have not seen any
> improvement.
> I have tried values of 10, 20, 30, and 40. I currently have 40 VMs running
> for this experiment.
>
> The job takes a minute to run on each machine through a shell.
>
> JOB_START_COUNT = 10: 12m 23s
> JOB_START_COUNT = 20: 12m 35s
> JOB_START_COUNT = 30: 12m 20s
> JOB_START_COUNT = 40: 11m 41s
>
> So I am changing that value but I'm still seeing the same trend in
> performance as when JOB_START_COUNT
> was at its default setting (1?)
>
> Grepping the process list is showing 40 condor_shadows ??? - does this mean
> there is a job running on every VM then?
> why is it still taking 12 mins to process.
>
> my cpu and memory usage is still acceptable with that many shadows.
> condor_status is showing all machines Claimed + Busy as expected.
>
> I have not touched the JOB_START_DELAY variable? But as default that means
> for example 40 jobs will execute.. it will wait 2 minutes
> and then execute the last 10. And I would expect the overall processing time
> to be just a couple of minutes?

Some tasks simply will not scale beyond a certain number of
subtasks. You still have one machine controlling all processes, sending
out all the data and bringing it back in again.

How much data is involved? You may simply be hitting a hard limit on
how fast that one machine can move it out and back.

A job which only runs for 1 minute is not a good candidate for massive
parallelization via Condor: the per-job overhead it adds is
considerable relative to a one-minute run time.
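
To make the overhead argument concrete, here is a rough toy model in
Python. It is only a sketch and not how Condor actually schedules: it
assumes a single submit machine that handles each job's start-up and
data transfer one at a time, and the 15-second per-job overhead is an
invented figure, not something measured on your pool. The function
name and its parameters are made up for the illustration.

```python
import heapq


def makespan(num_jobs, slots, run_time, serial_overhead):
    """Seconds to finish num_jobs in a toy model.

    Assumption (hypothetical, not Condor's real behaviour): one submit
    machine spends serial_overhead seconds per job on start-up/transfer,
    one job at a time, and each job then runs for run_time seconds on
    one of `slots` execute slots.
    """
    free_at = [0.0] * slots      # min-heap: time each slot becomes free
    heapq.heapify(free_at)
    submit_free = 0.0            # time the submit machine is next idle
    finish = 0.0
    for _ in range(num_jobs):
        slot_free = heapq.heappop(free_at)
        # A job starts only when both a slot and the submit machine are free.
        start = max(slot_free, submit_free)
        submit_free = start + serial_overhead
        end = submit_free + run_time
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    # 40 one-minute jobs, with an assumed 15 s of serialized overhead each.
    for slots in (10, 20, 40):
        print(slots, "slots:", makespan(40, slots, 60, 15), "seconds")
```

In this model, once the serialized per-job cost is a sizeable fraction
of the run time, the submit machine becomes the bottleneck and adding
more slots stops shortening the total, which is the same flat trend as
in the timings above.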

See previous discussion on this list between myself and Chris Miles
titled "Problems with Jobs"

Matt