Re: [Condor-users] Disk IO
- Date: Thu, 7 Jun 2012 17:26:26 -0400
- From: Tiago Macarios <tiagomacarios@xxxxxxxxx>
- Subject: Re: [Condor-users] Disk IO
I don't really want the jobs to wait on something before they start; I just want each job to be scheduled on a different disk, with each disk limited to 8 jobs (8 jobs * 4 disks = 32, the same as the number of cores).
I don't think Condor can have multiple EXECUTE folders, right? So I was thinking about doing something like:
1 - somehow get the disk the job is supposed to run on
2 - copy all files to the disk
3 - run it there
4 - copy back
5 - clean up
It just feels like something Condor should do... disks are a resource too, right? Something like an EXECUTE folder per disk and a concurrency limit.
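The five steps above can be sketched as a small job wrapper. This is only an illustration under stated assumptions: the mount points in `DISKS` are hypothetical, the helper names `pick_disk` and `run_on_disk` are made up for this example, and the slot name is assumed to look like `slot7` (static slots), for instance as read from the `_CONDOR_SLOT` environment variable if it is set for the job.

```python
import os
import re
import shutil
import subprocess
import tempfile

# Hypothetical mount points, one per physical disk; adjust to the real layout.
DISKS = ["/disk0", "/disk1", "/disk2", "/disk3"]


def pick_disk(slot_name, disks=DISKS):
    """Step 1: map a slot name such as 'slot7' to one disk, round-robin."""
    match = re.search(r"(\d+)", slot_name)
    slot_number = int(match.group(1)) if match else 1
    return disks[(slot_number - 1) % len(disks)]


def run_on_disk(command, input_files, disk, output_dir="."):
    """Steps 2-5: copy inputs to `disk`, run there, copy results back, clean up."""
    scratch = tempfile.mkdtemp(prefix="job_", dir=disk)
    try:
        # Step 2: copy all input files to the chosen disk.
        for path in input_files:
            shutil.copy(path, scratch)
        # Step 3: run the job with the scratch directory as its working directory.
        returncode = subprocess.call(command, cwd=scratch)
        # Step 4: copy back every file that was not one of the inputs.
        inputs = {os.path.basename(p) for p in input_files}
        for name in os.listdir(scratch):
            src = os.path.join(scratch, name)
            if name not in inputs and os.path.isfile(src):
                shutil.copy(src, output_dir)
        return returncode
    finally:
        # Step 5: clean up, even if the job or a copy failed.
        shutil.rmtree(scratch, ignore_errors=True)
```

With 32 static slots and 4 disks, the modulo in `pick_disk` sends exactly 8 slots to each disk, which matches the 8-jobs-per-disk limit without a separate counter. A wrapper would call something like `run_on_disk(cmd, files, pick_disk(os.environ.get("_CONDOR_SLOT", "slot1")))`.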
On Thu, Jun 7, 2012 at 5:02 PM, Diane Trout <diane@xxxxxxxxxxx> wrote:
The way we tried to deal with that was by writing a small daemon and helper that injected the fileserver's load into condor.
We then also modified the START conditional to make sure the fileserver's load was reasonable.
# this condor block tries to grab the load information:
STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) fileserver_load
STARTD_CRON_FILESERVER_LOAD_MODE = Periodic
STARTD_CRON_FILESERVER_LOAD_EXECUTABLE = /usr/bin/get_load
STARTD_CRON_FILESERVER_LOAD_ARGS = "-options"
# The start conditionals look like:
CpuBusy = ((LoadAvg / TotalCpus) > 1.5) || (fileserver_load > 9.0)
START=(CpuBusy == FALSE)
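As a rough idea of what a helper like /usr/bin/get_load might print, here is a minimal sketch. It is not the actual tool from this thread: it assumes the startd cron job only needs to write `attribute = value` lines to standard output, and it uses the local one-minute load average as a stand-in, since how the real helper queries the fileserver is not described here.

```python
import os


def classad_line(name, value):
    """Format one attribute the way the START expression expects to read it."""
    return "%s = %.2f" % (name, value)


def main():
    # Stand-in measurement (assumption): the local 1-minute load average.
    # A real helper would ask the fileserver for its load instead.
    load_1min = os.getloadavg()[0]
    print(classad_line("fileserver_load", load_1min))


if __name__ == "__main__":
    main()
```

The attribute name printed here has to match the name used in the CpuBusy expression, otherwise the comparison against 9.0 never sees a value.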
After implementing the above solution I then learned about concurrency limits, which might also work.
In some magical land condor would be able to tell which file servers a job is hitting and throttle just those jobs.
On Thu, Jun 07, 2012 at 04:47:40PM -0400, Tiago Macarios wrote:
> Hi All,
> I have a particular problem; if someone could help me I would
> really appreciate it.
> I have a condor pool of approx 30 machines. These machines run jobs that are
> disk and CPU intensive, but lately the jobs are more disk IO intensive than
> CPU intensive. This is causing some machines to actually IDLE while the
> disk is seeking. These machines can run 32 processes at a time, but
> currently I have to enforce a limit of 8 per machine (heuristic value),
> because of the disk.
> The machines already have 4 disks each. I was wondering if there is a way I
> could create a concurrency limit for disks and make condor copy and run the
> files on the different disks enforcing the 8 jobs per disk. The problem is
> that how can I get a variable that tells me which disk I should copy things
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> The archives can be found at: