
Re: [Condor-users] Condor and Oracle



ivailo penev wrote:
> Hello colleagues,
> 
> I have set up a Condor pool with four Windows machines. All the
> computers have the same characteristics. I started batch jobs in the
> pool. Each job calculates, analyzes data and finally writes results
> into an Oracle database. Oracle is installed on a separate
> machine. I noted the execution time for the simulation in the
> pool. Afterwards I executed the same jobs sequentially
> on a single machine in order to compare the times. The comparison
> shows that one job executed on a single computer takes about half
> as long as the same job executed in the pool.
> The pool is expected to be about four times faster than the same
> jobs executed on one computer. Could simultaneous access to Oracle
> be making the whole process slower? Or could there be another
> reason? I have plans to extend the
> pool to run real financial jobs, but, due to these results, I am not
> sure if it is worth doing.
> 
> Thanks in advance, Ivaylo Penev
> 
> Department of Computer Technique and Automation Technical University
> of Varna, Bulgaria

If you share a bit more about your benchmarking methodology, people may
have some ideas. For example, how are you measuring runtimes, and what
do you know about the execution phases of your job? It is possible that
Oracle is a bottleneck for you. Have you tried running the jobs in
parallel outside of Condor?
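
One way to try this outside of Condor is a small timing harness like the
sketch below. It is only an illustration: JOB is a placeholder command
(a short sleep), not the real simulation, and the function names are made
up for this example. Note that launching all jobs from one machine also
adds local CPU contention, so it is not identical to a four-machine pool,
but it does show how per-job times change when the jobs hit Oracle at the
same time.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

# Placeholder job (assumption): replace with the real simulation command.
JOB = [sys.executable, "-c", "import time; time.sleep(0.2)"]


def timed_run(cmd):
    """Run one command and return its elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def run_serial(cmds):
    """Run the commands one after another; return (per-job times, total)."""
    start = time.perf_counter()
    per_job = [timed_run(cmd) for cmd in cmds]
    return per_job, time.perf_counter() - start


def run_parallel(cmds):
    """Start all commands at once; return (per-job times, total)."""
    start = time.perf_counter()
    # Threads only wait on the child processes, so the jobs really overlap.
    with ThreadPoolExecutor(max_workers=len(cmds)) as pool:
        per_job = list(pool.map(timed_run, cmds))
    return per_job, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [JOB] * 4
    for label, runner in (("serial", run_serial), ("parallel", run_parallel)):
        per_job, total = runner(jobs)
        print(label, [round(t, 2) for t in per_job], "total", round(total, 2))
```

If the per-job times grow sharply in the parallel run while the database
host is busy, that points toward Oracle rather than Condor.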

It wasn't clear to me if you are comparing the total execution for the
jobs in parallel vs the time in serial, or if you compared one of the
parallel jobs to one of the serial. It seems more useful to do the former.

For example:

 parallel            | serial
  job #   exec time  |  job #   exec time
   1       1h        |   1       30m
   2       1h        |   2       30m
   3       1h        |   3       30m
   4       1h        |   4       30m
  wall clock: 1h + c |   wall clock: 2h

Job 1 runs in half the time in serial, but the batch of 4 in parallel
still completes in about half the wall-clock time of the serial batch.
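
The arithmetic behind that table can be written out as a short sketch.
The numbers are the illustrative ones from the example above, not
measurements, and the overhead term stands in for "c" (assumed to be 0
here for simplicity).

```python
def wall_clock_parallel(job_minutes, overhead_minutes=0):
    """All jobs run at once, so the batch ends when the slowest job ends."""
    return max(job_minutes) + overhead_minutes


def wall_clock_serial(job_minutes):
    """Jobs run one after another, so the batch time is the sum."""
    return sum(job_minutes)


# Example figures from the table: 4 jobs, 1h each in the pool, 30m each alone.
parallel_jobs = [60, 60, 60, 60]
serial_jobs = [30, 30, 30, 30]

parallel_total = wall_clock_parallel(parallel_jobs)  # 60 minutes (+ c)
serial_total = wall_clock_serial(serial_jobs)        # 120 minutes
speedup = serial_total / parallel_total              # 2.0

print(f"parallel: {parallel_total}m, serial: {serial_total}m, "
      f"speedup: {speedup:.1f}x")
```

So even when each pooled job is twice as slow as a standalone job, the
batch as a whole can still finish about twice as fast, which is why
comparing total wall-clock times is the more useful measurement.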

Best,


matt