[Condor-users] Restricting the number of jobs copying data
- Date: Wed, 17 Jun 2009 18:26:51 +1100
- From: Mark Assad <massad@xxxxxxxxx>
- Subject: [Condor-users] Restricting the number of jobs copying data
I have a situation where I'd like to limit the number of nodes
which are able to read files from a file server and would appreciate
any hints on how to go about doing it.
In summary we have about 500 CPU nodes all trying to read data from
a single file server. The way the jobs are currently set up is that
when they start they copy the data they are going to work on from the
file server to the local node, and then process that data. When the
processing is finished, they copy the data back to the file server.
What I would like to be able to do is submit all the jobs to Condor,
then restrict the number of jobs that are allowed to be copying at
any one time. As each job finishes copying its data, its processing
part will start, allowing a new job to start copying data.
Each job has three parts, Part A copies data, then Part B processes
the data, then Part C writes the results. I'd like to limit the number
of jobs that are in Part A and C at any one time, while at the same
time allow any number of jobs to be in Part B.
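To make the structure concrete, here is a minimal sketch of what such a three-part job wrapper could look like, limiting Parts A and C with a pool of lock files. It assumes every node can see a shared lock directory and that flock behaves reliably on it (which is often not true over NFS, so a real deployment would need care); the paths, slot count, and echo placeholders are all illustrative, not a working answer:

```shell
#!/bin/sh
# Sketch: each job grabs one of $SLOTS lock files before doing any
# file-server I/O; the pool size caps how many jobs copy at once.
LOCKDIR=/tmp/copy-locks   # illustrative; would live on shared storage
SLOTS=50
mkdir -p "$LOCKDIR"

acquire_slot() {
    while :; do
        i=0
        while [ "$i" -lt "$SLOTS" ]; do
            exec 9>"$LOCKDIR/slot.$i"
            if flock -n 9; then
                return 0        # lock held on fd 9 until released
            fi
            i=$((i + 1))
        done
        sleep 5                 # every slot busy; wait and retry
    done
}

release_slot() {
    exec 9>&-                   # closing the fd drops the lock
}

acquire_slot
echo "Part A: copying input"    # placeholder for the real copy-in
release_slot

echo "Part B: processing"       # no limit on concurrent processing

acquire_slot
echo "Part C: writing results"  # placeholder for the real copy-back
release_slot
```

The point of the sketch is only that the limit applies per phase, not per job: a job holds a slot during A, gives it up for B, and takes a (possibly different) slot again for C.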
The 'simple' solution would be to allow only 50 jobs to run at once.
But ideally, once those 50 jobs have finished copying their data and
moved on to processing, the next 50 should be allowed to start copying.
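For reference, that 50-running-jobs workaround would just be a one-line change in the submit host's configuration (a sketch, assuming all the jobs go through a single schedd; MAX_JOBS_RUNNING is the standard knob for this, but it caps whole jobs, not the copy phase):

```
# condor_config.local on the submit host
MAX_JOBS_RUNNING = 50
```

which is exactly why it wastes capacity here: jobs sitting in Part B still count against the 50.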
I'm pretty much at a loss as to where to start to create this
restriction, so any hints would be appreciated.