
Re: [Condor-users] GCB Performance



In Condor, many configuration variables have defaults chosen not for best performance in every case but for correct operation in most cases. As you have already seen, three variables combined to make the performance results hard to interpret. Please set those variables appropriately before you try any more experiments.

Warning: those variables are just the ones I found while analyzing your log files. There may be more...

Chris Miles wrote:
Hi. I'm not really testing GCB performance as such. I could only
think that GCB was causing my overall performance issue. The thread probably could
have done with a better title.

So in my last test, which I ran earlier this evening before I left campus, my 1 minute jobs
were still only being executed on a few machines.

Is it not one shadow daemon per job running on the pool? I am seeing 20 to 40 shadow processes already. I still don't really understand how 50 jobs running on a couple of VMs
can give better performance than 50 jobs running on 40 VMs.

I think I might run an experiment tomorrow to see how the performance ratio goes up as
the job workload goes up.

I presume performance would be better for long-running data analysis, etc.?

Chris
----- Original Message ----- From: "Se-Chang Son" <sschang@xxxxxxxxxxx>
Cc: <condor-users@xxxxxxxxxxx>
Sent: Wednesday, January 18, 2006 9:28 PM
Subject: Re: [Condor-users] GCB Performance



Chris Miles wrote:

OK. The Condor pool is made up of machines with exactly the same spec. It's
an IBM cluster.

I first ran a test to see how long my 50 jobs would take on just one
machine (2 VMs), and it took 5m 11s.
I then loaded up 10 nodes -- jobs took 2m 8s
I then loaded up 20 nodes -- jobs took 2m 22s

I think I also figured out why 20 nodes were slower than 10 nodes. The
reason might be another configuration issue. In the experiment with 20
nodes, but not in the one with 10 nodes, there was a 20 sec delay between
job submission and match notification from the negotiator. By default,
Condor puts a 20 sec delay between negotiation cycles. Please look for the
NEGOTIATOR_CYCLE_DELAY variable in the manual. So it seems you ran the
"20 nodes" experiment within this 20 sec delay, and that made 20 nodes
slower than 10 nodes.
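
To put rough numbers on this, here is a small Python sketch using only the
timings quoted above. It assumes (my simplification, not something the logs
prove) that the 20 sec negotiation delay adds directly to the wall-clock time
of the 20-node run. The names below are just illustrative.

```python
# Back-of-the-envelope check of the timings reported in this thread.
# Assumption: the 20 sec negotiation delay adds straight onto wall-clock time.

def to_seconds(minutes, seconds):
    """Convert a 'Xm Ys' timing into seconds."""
    return minutes * 60 + seconds

one_node = to_seconds(5, 11)      # 50 jobs on one machine (2 VMs)
ten_nodes = to_seconds(2, 8)      # 50 jobs on 10 nodes
twenty_nodes = to_seconds(2, 22)  # 50 jobs on 20 nodes

CYCLE_DELAY = 20  # seconds; the delay seen only in the 20-node run

# Speedup relative to the single-machine run.
speedup_10 = one_node / ten_nodes
speedup_20 = one_node / twenty_nodes

# What the 20-node run would have taken without the delay (under the assumption).
adjusted_20 = twenty_nodes - CYCLE_DELAY
speedup_20_adjusted = one_node / adjusted_20

print(f"10 nodes: {ten_nodes}s, speedup {speedup_10:.2f}x")
print(f"20 nodes: {twenty_nodes}s, speedup {speedup_20:.2f}x")
print(f"20 nodes minus delay: {adjusted_20}s, speedup {speedup_20_adjusted:.2f}x")
```

Under that assumption the 20-node run comes out at 122 sec against 128 sec for
10 nodes, so the delay alone is enough to account for the reversed ordering.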

I don't know how you did your experiment. However, I don't believe
that it is a good idea to evaluate GCB performance by measuring the time
between submitting jobs and their completion. Too many factors and
coincidences affect that measurement. If you are measuring Condor
performance, I would recommend that you talk with Miron.


Attached are the logs from the submission machine for the 10 and 20
node tests.
thanks
Chris

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users


