Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] Whole System Scheduling

Date: Fri, 23 Oct 2009 11:50:05 -0400
From: "Jonathan D. Proulx" <jon@xxxxxxxxxxxxx>
Subject: [Condor-users] Whole System Scheduling

Hi All,

I've been trying to get whole system scheduling working on my pool
for some months now and it is becoming a rather critical issue.

I've been basing my config off of http://nmi.cs.wisc.edu/node/1482

Ideally I'd like

  1) Whole system jobs _must_ not run until they have the whole system
  2) Non "PriorityGroup" (predefined in config) jobs _should_ be
    preempted when a "PriorityGroup" whole system job is scheduled
  3) Whole system jobs _should_ be suspended until all single slot
    "PriorityGroup" jobs complete

Point one is critical as much of the code users are looking to
schedule in this way is benchmark code that is only meaningful if the
rest of the system is quiescent.
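
To make the three rules concrete, here is a minimal Python sketch of the
decision I'd like made for each machine. It is only a model of the policy,
not Condor configuration: the Slot class, plan_whole_system_job() and the
example user names are all hypothetical, and "PriorityGroup" is represented
as a plain set of user names.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Slot:
    """One slot on a machine (hypothetical model, not a Condor type)."""
    slot_id: int
    job_owner: Optional[str] = None  # None means the slot is idle


def plan_whole_system_job(
    slots: List[Slot], priority_group: Set[str]
) -> Tuple[bool, List[int], List[int]]:
    """Decide what to do when a whole system job is matched to a machine.

    Returns (may_start, preempt_ids, wait_ids):
      may_start   - True only if every slot is idle (rule 1)
      preempt_ids - busy slots whose owner is not in priority_group (rule 2)
      wait_ids    - busy slots whose owner is in priority_group; the whole
                    system job stays suspended until these finish (rule 3)
    """
    preempt_ids = [
        s.slot_id for s in slots
        if s.job_owner is not None and s.job_owner not in priority_group
    ]
    wait_ids = [
        s.slot_id for s in slots
        if s.job_owner is not None and s.job_owner in priority_group
    ]
    # Rule 1: the job must not run while any other job is still on the box,
    # including jobs that have been told to vacate but have not yet exited.
    may_start = not preempt_ids and not wait_ids
    return may_start, preempt_ids, wait_ids


if __name__ == "__main__":
    machine = [Slot(1), Slot(2, "alice"), Slot(3, "bob")]
    print(plan_whole_system_job(machine, {"alice"}))
```

The point of returning may_start separately is that preempting a slot is not
the same as the slot being empty: the whole system job should only begin
once both lists are empty.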

Ignoring non-PriorityGroup users for now, I tried simply suspending
the whole system job until all other slots are clear. This fails at
MaxSuspendTime, which is understandable, except that the job does
execute for some period of time before being killed and requeued
(usually in the exact same slot).

Failing that, if #2 and/or #3 are difficult:
  4) All single slot jobs _may_ be preempted by a whole system job

Trying #4, I see that the whole system job gets scheduled on slot one
and starts running.  Slots 2 through N continue executing for 3 min (a
negotiation cycle?) before they exit.

My fondest wish would be for Condor to be able to allocate multiple CPUs,
so that jobs could simply require some number. (They could if I
configured a matrix of mutually exclusive slots, I guess, but as we get
up into the world of 16 and more cores this gets crazy.)

Help?
-Jon

Follow-Ups:
- Re: [Condor-users] Whole System Scheduling
  - From: Jonathan D. Proulx
- Re: [Condor-users] Whole System Scheduling
  - From: Dan Bradley
- Re: [Condor-users] Whole System Scheduling
  - From: Ian Chesal
- Re: [Condor-users] Whole System Scheduling
  - From: Ian Chesal
- Re: [Condor-users] Whole System Scheduling
  - From: Ioan Raicu
