
Re: [Condor-users] GCB Performance



To check that all nodes (or even VMs) can participate in job execution (and since you mention GCB
it sounds like you are in a multi-firewall situation), you can use a submit file that sends a job
to each VM in the pool in turn and runs something like "hostname", pinning each job to its
target machine in the Requirements expression. That makes a good smoke test; see the sketch below.
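A minimal sketch of one such probe (the slot name and filenames are placeholders, adjust for your pool):

    universe     = vanilla
    executable   = /bin/hostname
    # Pin the probe to one specific VM/slot; repeat for each machine in the pool.
    requirements = (Name == "vm1@node01.example.org")
    output       = probe.node01.out
    error        = probe.node01.err
    log          = probe.log
    queue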
 
I use a script that builds such submit files automatically from condor_status output;
it can select on any ClassAd attribute.
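Roughly along these lines (an untested sketch; assumes condor_status and condor_submit are in your PATH):

    #!/bin/sh
    # Generate and submit one "hostname" probe per slot/VM in the pool.
    condor_status -format "%s\n" Name | while read vm; do
        cat > probe."$vm".sub <<EOF
    universe     = vanilla
    executable   = /bin/hostname
    requirements = (Name == "$vm")
    output       = probe.$vm.out
    error        = probe.$vm.err
    log          = probe.log
    queue
    EOF
        condor_submit probe."$vm".sub
    done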
 
cheers
 
JK
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Chris Miles
Sent: Wednesday, January 18, 2006 8:18 PM
To: sschang@xxxxxxxxxxx
Cc: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] GCB Performance

I just ran another set of tests:
one with 20 nodes, one with 15 nodes and one with 10 nodes.

And I ran longer jobs this time to make sure it was not a problem with the length of my jobs.

This time I submitted 50 jobs, each of which runs the executable 1000 times rather than 100 on
each node. I tested the job on one of the nodes directly through the shell, not through
Condor, and it took just over a minute to run. So I'm sending off 50 jobs of roughly a
minute each.

20 Nodes (40vm) : 11m 41s
15 Nodes (15vm) : 10m 54s
10 Nodes (20vm) : 11m 43s 

The overall runtime would be about 50 minutes if one machine ran all 50 jobs standalone
through a shell (50 jobs x ~1 min each), yet we are only getting the job done in about a
fifth of that time, roughly a 4-5x speedup. So it's as if only 5 machines are
actually doing anything and the rest are watching.

The other thing is: for each job I am returning 100 or 1000 output files back to the
submitting machine. Could this also be a cause of the poor performance? Maybe my script
should tar them all up, leaving Condor only one file per job to transfer back?
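One way to do that (a sketch; "myprog", its arguments, and the out.* pattern are placeholders
for my actual executable and output names) is a small wrapper that Condor runs instead of the
program itself:

    #!/bin/sh
    # Run the real executable, then bundle its output files into a single
    # tarball so Condor transfers one file back instead of hundreds.
    ./myprog "$@"
    tar czf results.tar.gz out.*
    rm -f out.*

With file transfer enabled, adding transfer_output_files = results.tar.gz to the submit file
would then limit the transfer to that single tarball.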

Chris