
Re: [Condor-users] Disk IO



If you set SHOULD_TRANSFER_FILES = YES in your condor_submit script, you can tell Condor to stage the job's files into its scratch directory (under EXECUTE) on whatever computer the job ends up running on.

http://research.cs.wisc.edu/condor/manual/v7.4/2_5Submitting_Job.html#SECTION00354000000000000000

That's a great solution if your jobs are doing a bunch of random IO on the input files.
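
As a minimal sketch (the executable and file names are placeholders I made up, not anything from your setup), a submit description that turns on file transfer could look like this:

```
# Hypothetical job: "analyze" and "input.dat" are placeholder names.
universe                = vanilla
executable              = analyze
arguments               = input.dat

# Copy the listed inputs to the execute machine before the job starts,
# and copy output files back when the job exits.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = input.dat

output                  = analyze.out
error                   = analyze.err
log                     = analyze.log
queue
```

The job then reads input.dat from its local working directory on the execute machine rather than over the network.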

Diane

On Thu, Jun 07, 2012 at 05:26:26PM -0400, Tiago Macarios wrote:
> I don't really want to wait on something for the jobs to start; I just want
> each job to be scheduled on a different disk, with each disk having an
> 8-job limit (8 jobs * 4 disks = 32, the same as the number of cores).
>
> I don't think Condor can have multiple EXECUTE folders, right? So I was
> thinking about doing something like:
> 1 - somehow get the disk the job is supposed to run on
> 2 - copy all files to the disk
> 3 - run it there
> 4 - copy back
> 5 - clean up
>
> It just feels like something Condor should do... disks are a resource
> too, right? Something like an EXECUTE folder per disk and a concurrency
> limit.
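
For what it's worth, as far as I know the startd does have a per-slot SLOT<N>_EXECUTE setting (SLOTx_EXECUTE in older manuals) that falls back to EXECUTE when unset; check the manual for your version. A rough sketch, assuming 32 static slots and four disks mounted at /disk1 through /disk4 (the mount points are placeholders), rotates the slots across the disks so that each disk ends up with 8 slots:

```
# Assumed mount points; adjust to the real disk layout.
SLOT1_EXECUTE = /disk1/condor/execute
SLOT2_EXECUTE = /disk2/condor/execute
SLOT3_EXECUTE = /disk3/condor/execute
SLOT4_EXECUTE = /disk4/condor/execute
SLOT5_EXECUTE = /disk1/condor/execute
SLOT6_EXECUTE = /disk2/condor/execute
SLOT7_EXECUTE = /disk3/condor/execute
SLOT8_EXECUTE = /disk4/condor/execute
# Continue the same rotation up to SLOT32_EXECUTE.
```

Each directory needs the same ownership and permissions as the regular EXECUTE directory. With file transfer enabled, the copy in, copy back, and clean-up steps are then handled per slot, and the 8-per-disk cap follows from the slot layout rather than from a separate limit.
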
>
> On Thu, Jun 7, 2012 at 5:02 PM, Diane Trout <diane@xxxxxxxxxxx> wrote:
>
> > The way we tried to deal with that was by writing a small daemon and
> > helper that injected the fileserver's load into Condor.
> >
> > We then also modified the START conditional to make sure the fileserver's
> > load was reasonable.
> >
> > # this condor block tries to grab the load information:
> > STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) fileserver_load
> > STARTD_CRON_FILESERVER_LOAD_MODE = Periodic
> > STARTD_CRON_FILESERVER_LOAD_EXECUTABLE = /usr/bin/get_load
> > STARTD_CRON_FILESERVER_LOAD_ARGS = "-options"
> > STARTD_CRON_FILESERVER_LOAD_PERIOD=60s
> > STARTD_CRON_FILESERVER_LOAD_LOAD=1
> >
> > # The start conditionals look like:
> > CpuBusy = ((LoadAvg / TotalCpus) > 1.5) || (fileserver_load > 9.0)
> >
> > START=(CpuBusy == FALSE)
> >
> > After implementing the above solution I learned about concurrency
> > limits, which might also work.
> >
> >
> > http://research.cs.wisc.edu/condor/manual/v7.4/3_13Setting_Up.html#SECTION0041314000000000000000
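
A rough sketch of the concurrency-limit variant, assuming a made-up limit name "fileserver" and an arbitrary cap of 20. Note that these limits are counted across the whole pool, not per machine:

```
# In the negotiator's configuration: at most 20 running jobs may hold
# the (hypothetical) "fileserver" limit at any one time.
FILESERVER_LIMIT = 20

# In the submit description file of each job that hits the file server:
concurrency_limits = fileserver
```

Jobs that do not list the limit in their submit file are not counted against it.
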
> >
> > In some magical land Condor would be able to tell which file servers a
> > job is hitting and throttle just those jobs.
> >
> > On Thu, Jun 07, 2012 at 04:47:40PM -0400, Tiago Macarios wrote:
> > > Hi All,
> > >
> > > I have a particular problem; if someone could help me I would
> > > really appreciate it.
> > > I have a Condor pool of approx. 30 machines. These machines run jobs
> > > that are disk and CPU intensive, but lately the jobs are more disk IO
> > > intensive than CPU intensive. This is causing some machines to actually
> > > sit idle while the disk is seeking. These machines can run 32 processes
> > > at a time, but currently I have to enforce a limit of 8 per machine
> > > (a heuristic value) because of the disk.
> > >
> > > The machines already have 4 disks each. I was wondering if there is a
> > > way I could create a concurrency limit for disks and make Condor copy
> > > and run the files on the different disks, enforcing the 8 jobs per
> > > disk. The problem is: how can I get a variable that tells me which disk
> > > I should copy things to?
> > >
> > > Thanks,
> > >
> > > Mac.
> >
> > > _______________________________________________
> > > Condor-users mailing list
> > > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > > subject: Unsubscribe
> > > You can also unsubscribe by visiting
> > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > >
> > > The archives can be found at:
> > > https://lists.cs.wisc.edu/archive/condor-users/

