
Re: [Condor-users] GCB Performance



Yeah, I shall definitely try those out tomorrow. I'm not really bothered about CPU or memory usage on the submitting machine. That's only running GT4 and Condor and acts as my middleware, nothing else. I just want to get my jobs
submitting as fast as possible.

Do you think it might be worthwhile tarballing the numerous output files? Would this help Condor out a little?

Chris
----- Original Message -----
From: "Se-Chang Son" <sschang@xxxxxxxxxxx>
To: <condor-users@xxxxxxxxxxx>
Sent: Wednesday, January 18, 2006 8:29 PM
Subject: Re: [Condor-users] GCB Performance


I originally sent this only to Chris, so I am posting it to the group.

The log files show that each job runs for about 10 seconds. To avoid
overloading the submit machine with too many running processes and too
many files in transit, Condor by default puts a 2-second delay between
job starts. This is what the manual says:

"This integer-valued macro--JOB_START_DELAY--works together with the
JOB_START_COUNT macro to throttle job starts. The condor_schedd daemon
starts $(JOB_START_COUNT) jobs at a time, then delays for
$(JOB_START_DELAY) seconds before starting the next set of jobs. This
delay prevents a sudden, large load on the submit machine as it spawns
many condor_shadow daemons simultaneously, and it prevents having to
deal with their start up activity all at once. The resulting job start
rate averages as fast as ($(JOB_START_COUNT)/$(JOB_START_DELAY))
jobs/second. This configuration variable is also used during the
graceful shutdown of the condor_schedd daemon. During graceful
shutdown, this macro determines the wait time in between requesting each
condor_shadow daemon to gracefully shut down. It is defined in terms of
seconds and defaults to 2. Setting this macro to a lower value is not
advised, as it can overwhelm the condor_schedd daemon."
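
The rate formula in the quoted passage is plain arithmetic. A minimal Python sketch of it follows; the function name job_start_rate is only illustrative and is not part of Condor, and the JOB_START_COUNT value of 1 is an assumption inferred from "2 second delay between job invocations" (the quoted text states only the default for JOB_START_DELAY).

```python
def job_start_rate(job_start_count: int, job_start_delay: float) -> float:
    """Average job starts per second, i.e. JOB_START_COUNT / JOB_START_DELAY."""
    if job_start_delay <= 0:
        # This estimate is only meaningful for a positive delay.
        raise ValueError("job_start_delay must be positive for this estimate")
    return job_start_count / job_start_delay


# Assumed JOB_START_COUNT = 1, with the documented default delay of 2 seconds.
print(job_start_rate(1, 2))  # 0.5 jobs/second, one job start every 2 seconds
```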

With this default configuration, each job finishes before Condor has
launched all of the matched jobs, so machines become available again
for jobs that are still waiting for their next match. At one job start
every 2 seconds and about 10 seconds per job, only about 5 jobs run at
once, so you need just 5 or 6 VMs to maximize performance in your case.
Adding more machines contributes nothing, and that's why you get
basically the same performance with 20 VMs and 40 VMs.
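
The reasoning above can be sketched as a small back-of-the-envelope model in Python. It uses the figures from this thread (about 10 seconds per job, a 2-second start delay, 50 jobs) and assumes one job per start interval. The function names are hypothetical, and the model deliberately ignores matchmaking, file transfer and shadow start-up time, so it gives a rough lower bound rather than a prediction.

```python
import math


def useful_slots(run_time_s, start_count=1, start_delay_s=2.0):
    """Most jobs that can be running at once when job starts are throttled."""
    start_rate = start_count / start_delay_s  # jobs started per second
    return math.ceil(start_rate * run_time_s)


def estimated_makespan_s(n_jobs, slots, run_time_s, start_count=1, start_delay_s=2.0):
    """Rough wall-clock lower bound: the larger of the throttle and capacity limits."""
    # Time at which the last batch starts, plus the time for it to run.
    throttle_bound = (n_jobs - 1) // start_count * start_delay_s + run_time_s
    # Time needed if the only limit were the number of slots (VMs).
    capacity_bound = math.ceil(n_jobs / slots) * run_time_s
    return float(max(throttle_bound, capacity_bound))


print(useful_slots(10))                  # 5
print(estimated_makespan_s(50, 2, 10))   # 250.0 (slot-limited)
print(estimated_makespan_s(50, 20, 10))  # 108.0 (throttle-limited)
print(estimated_makespan_s(50, 40, 10))  # 108.0 (throttle-limited)
```

Under these assumptions, going from 20 to 40 VMs changes nothing because the start throttle, not the number of VMs, sets the bound. The measured times in the thread are somewhat higher than these figures, which is expected given the overheads the model leaves out.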


Chris Miles wrote:
OK. The Condor pool is made up of machines with exactly the same spec.
It's an IBM cluster.

I first ran a test to see how long my 50 jobs would take on just one
machine (2 VMs), and it took 5m 11s.
I then loaded up 10 nodes -- jobs took 2m 8s
I then loaded up 20 nodes -- jobs took 2m 22s
Attached are the logs from the submission machine for the 10 and 20
node tests.
Thanks
Chris

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users