
Re: [Condor-users] Disk IO



That sounds like a bug. Will you open a ticket on http://condor-wiki.cs.wisc.edu for it?

Best,


matt

On 06/25/2012 06:29 AM, Tiago Macarios wrote:
Hey Matthew,

Thanks a lot for this. Now everything works perfectly.
One thing, though: Condor is not picking up the right size of the disks.
Do I need to configure that somewhere else?
I have mounted:
/dev/sdd /var/lib/condor/execute1
/dev/sdc /var/lib/condor/execute2
/dev/sdd /var/lib/condor/execute3
/dev/sde /var/lib/condor/execute4

All disks are 1 TB and empty, but Condor thinks they are full.
It looks like it is getting the free space from EXECUTE and not from
SLOT<N>_EXECUTE. The workaround I found is adding:
request_disk = 0
requirements = $(TARGET.Disk>=0)

as requirements... Any idea?
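
For reference, the "8 slots per disk" layout discussed in this thread can be written out as one SLOT<N>_EXECUTE line per slot. The following is only a minimal sketch: it assumes 32 static slots numbered 1 to 32, and the function name slot_execute_config is made up for illustration. It generates config text and does not address the Disk value reported above.

```python
def slot_execute_config(execute_dirs, slots_per_dir):
    """Return config lines that assign slots to execute directories.

    Slots are numbered from 1; each consecutive group of
    `slots_per_dir` slots shares one directory.
    """
    lines = []
    slot = 1
    for directory in execute_dirs:
        for _ in range(slots_per_dir):
            lines.append(f"SLOT{slot}_EXECUTE = {directory}")
            slot += 1
    return lines


if __name__ == "__main__":
    # The four mount points listed in the message above.
    dirs = [f"/var/lib/condor/execute{i}" for i in range(1, 5)]
    print("\n".join(slot_execute_config(dirs, 8)))
```

With four directories and 8 slots each, slots 1-8 land on execute1, slots 9-16 on execute2, and so on, which gives the 8 * 4 = 32 layout.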

On Fri, Jun 8, 2012 at 2:30 AM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:

    http://research.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#15621

    SLOT<N>_EXECUTE

    Enjoy.

    Best,


    matt


    On 06/07/2012 05:26 PM, Tiago Macarios wrote:

        I don't really want the jobs to wait on something before they start. I
        just want each job to be scheduled on a different disk, with each disk
        having an 8-job limit (8 jobs * 4 disks = 32, the same as the number of
        cores).

        I don't think Condor can have multiple EXECUTE folders, right? So I was
        thinking about doing something like:
        1 - somehow get the disk the job is supposed to run on
        2 - copy all files to that disk
        3 - run it there
        4 - copy the results back
        5 - clean up
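
Steps 2 to 5 above can be sketched as a small job wrapper. This is only an illustration, not something Condor provides: run_staged is a hypothetical helper, the target disk (step 1) is assumed to be passed in as disk_root, and "results" is assumed to mean any file the command creates in its working directory.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_staged(command, input_files, disk_root, output_dir):
    """Stage inputs onto disk_root, run command there, copy new files back."""
    # Private scratch directory on the chosen disk.
    scratch = Path(tempfile.mkdtemp(dir=disk_root))
    try:
        # Step 2: copy all input files to the disk.
        for src in input_files:
            shutil.copy2(src, scratch)
        before = {p.name for p in scratch.iterdir()}

        # Step 3: run the job with the scratch directory as its cwd.
        result = subprocess.run(command, cwd=scratch, check=False)
        if result.returncode != 0:
            print(f"command exited with {result.returncode}", file=sys.stderr)

        # Step 4: copy back only the files the job created.
        out = Path(output_dir)
        out.mkdir(parents=True, exist_ok=True)
        for path in scratch.iterdir():
            if path.is_file() and path.name not in before:
                shutil.copy2(path, out)
        return result.returncode
    finally:
        # Step 5: clean up, even if the job or a copy failed.
        shutil.rmtree(scratch, ignore_errors=True)
```

The try/finally keeps the disk clean when the job fails, which matters if many jobs share the same disk.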

        It just feels like something Condor should do... disks are a
        resource too, right? Something like an EXECUTE folder per disk and a
        concurrency limit.

        On Thu, Jun 7, 2012 at 5:02 PM, Diane Trout <diane@xxxxxxxxxxx> wrote:

            The way we tried to deal with that was by writing a small daemon and
            helper that injected the file server's load into Condor.

            We then also modified the START conditional to make sure the
            file server's load was reasonable.

            # This Condor config block tries to grab the load information:
            STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) fileserver_load
            STARTD_CRON_FILESERVER_LOAD_MODE = Periodic
            STARTD_CRON_FILESERVER_LOAD_EXECUTABLE = /usr/bin/get_load
            STARTD_CRON_FILESERVER_LOAD_ARGS = "-options"
            STARTD_CRON_FILESERVER_LOAD_PERIOD = 60s
            STARTD_CRON_FILESERVER_LOAD_LOAD = 1
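
The get_load helper itself is not shown in this thread. A startd cron job reports values by printing "Name = value" lines on stdout, so a stand-in could look roughly like the sketch below. It is an assumption throughout: the function names are made up, and it supposes some separate daemon writes the file server's load to a local file in the /proc/loadavg format.

```python
import sys


def classad_line(name, value):
    """Format one numeric attribute as a 'Name = value' output line."""
    return f"{name} = {value:.2f}"


def read_load(path):
    """Read the 1-minute load average from a /proc/loadavg-style file."""
    with open(path) as handle:
        return float(handle.read().split()[0])


def main(path):
    # Hypothetical: `path` is kept up to date by a separate daemon
    # that polls the file server.
    print(classad_line("fileserver_load", read_load(path)))


if __name__ == "__main__":
    # Expects the load file path as the only argument.
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

The attribute name fileserver_load matches the one used in the start conditionals in this message.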

            # The start conditionals look like:
            CpuBusy = ((LoadAvg / TotalCpus) > 1.5) || (fileserver_load > 9.0)

            START = (CpuBusy == FALSE)
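
To make the thresholds concrete, here is the same logic as plain Python. It is a sketch only: the function names are invented, and real ClassAd evaluation also has an UNDEFINED value (for example when fileserver_load has not been published yet), which this ignores.

```python
def cpu_busy(load_avg, total_cpus, fileserver_load):
    """Mirror of CpuBusy: per-CPU load above 1.5, or file server load above 9.0."""
    return (load_avg / total_cpus) > 1.5 or fileserver_load > 9.0


def start(load_avg, total_cpus, fileserver_load):
    """Mirror of START: only start jobs while the machine is not busy."""
    return not cpu_busy(load_avg, total_cpus, fileserver_load)
```

For a 32-core machine with a load average of 8.0, start() is true while the file server load stays at or below 9.0, and becomes false as soon as it goes above that.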

            After implementing the above solution I then learned about
            concurrency limits, which might also work.

            http://research.cs.wisc.edu/condor/manual/v7.4/3_13Setting_Up.html#SECTION0041314000000000000000

            In some magical land Condor would be able to tell which file
            servers a job is hitting and throttle just those jobs.

            On Thu, Jun 07, 2012 at 04:47:40PM -0400, Tiago Macarios wrote:
             > Hi All,
             >
             > I have a particular problem; if someone could help me I would
             > really appreciate it.
             > I have a Condor pool of approx. 30 machines. These machines run jobs
             > that are disk and CPU intensive, but lately the jobs are more disk IO
             > intensive than CPU intensive. This is causing some machines to
             > actually idle while the disk is seeking. These machines can run 32
             > processes at a time, but currently I have to enforce a limit of 8 per
             > machine (a heuristic value) because of the disk.
             >
             > The machines already have 4 disks each. I was wondering if there is a
             > way I could create a concurrency limit for disks and make Condor copy
             > and run the files on the different disks, enforcing the 8 jobs per
             > disk. The problem is: how can I get a variable that tells me which
             > disk I should copy things to?
             >
             > Thanks,
             >
             > Mac.

             > _________________________________________________
             > Condor-users mailing list
             > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxxxx
             > with a subject: Unsubscribe
             > You can also unsubscribe by visiting
             > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
             >
             > The archives can be found at:
             > https://lists.cs.wisc.edu/archive/condor-users/

