Re: [Condor-users] Support for user-driven tiered jobs?



There is one simple thing we have done that might help.
We have a Condor-G job submission facility here at Fermilab
that doesn't involve glideins. Say there is a two-user group with
user A and user B. User B may have 2000 grid universe jobs submitted
and pending, but then 5 high-priority jobs from user A need to get to
the front of the queue. That can be accomplished with the "priority"
field in the job classad, which works for both vanilla universe and
grid universe jobs. Similarly, with the right combination of single
and double quotes and backslashes, you can send a "priority" field to
a remote grid site, so that if user B already has 400 jobs pending,
user A can get to the head of the line and get his job started, even
if the remote grid site happens to map user A and user B to the same
user in Condor.
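
As a rough illustration of the local half of this (not the exact setup
used at Fermilab), the sketch below builds two submit descriptions that
differ only in the "priority" submit command, which sets the job's
priority attribute (default 0, higher values are considered first).
The helper name render_submit, the script names, and the gatekeeper
host are hypothetical, and the quoting needed to pass a priority
through to the remote site is deliberately left out.

```python
def render_submit(executable, priority=0, universe="vanilla",
                  grid_resource=None, count=1):
    """Return the text of a Condor submit description with a job priority.

    Hypothetical helper for illustration only; it just formats text.
    """
    lines = [
        f"universe = {universe}",
        f"executable = {executable}",
        # Higher integers are considered first; 0 is the default.
        f"priority = {priority}",
    ]
    if universe == "grid":
        if grid_resource is None:
            raise ValueError("grid universe jobs need a grid_resource")
        lines.append(f"grid_resource = {grid_resource}")
    lines.append(f"queue {count}")
    return "\n".join(lines) + "\n"


# Assumed example resource string; replace with a real gatekeeper.
RESOURCE = "gt2 gatekeeper.example.org/jobmanager-condor"

# The large, low-urgency batch keeps the default priority of 0.
bulk = render_submit("bulk.sh", priority=0, universe="grid",
                     grid_resource=RESOURCE, count=2000)

# The 5 urgent jobs get a higher priority value.
urgent = render_submit("urgent.sh", priority=10, universe="grid",
                       grid_resource=RESOURCE, count=5)

print(urgent)
```

Each string can be saved to a file and handed to condor_submit. For
jobs that are already queued, "condor_prio -p 10 <cluster>" changes the
priority in place.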

Steve


On Thu, 4 Feb 2010, Ian Stokes-Rees wrote:

We are faced with a problem where jobs from one user within a group
block out all jobs by another user within the same user group.  We would
like to have a tiered system that is *user* driven to specify "Within my
user group, place this job on Tier N", so that a large set of jobs by
one user (i.e. possibly several weeks of full cluster usage) could be
set at a lower priority, thereby allowing jobs from another user within
the same group to take priority.  Sort of putting the jobs into a
"backfill" mode within the user group, but NOT "overall backfill" --
they should compete equally with all jobs from other user groups.

Then, when a machine classad is being matched, and user A and user B,
both within the same user group, have two jobs X and Y respectively
that are in all other ways identical, the match will prefer the job
with the higher (better) tier.

Right now we can't see how that is possible from the user side -- users
can't put anything in their Rank expression that refers to other jobs
that are being considered at the same time.  A job can only consider the
machine it is being matched against.  It is the machine that has the
power to decide between jobs.

I realize there are some simple things that can be put into the machine
Rank expression, however my situation is complicated by the fact that
this is happening in a grid environment, so:

* The set of user groups is large and not easily listed or known a
priori.

* The implementation of this policy would need to be done by many
independent condor deployments, so would need to have minimal or no
effect on jobs that do not want to use this system.

* The implementation of this policy would need to be (relatively) simple
in order to get agreement.

* Ideally some clever user-based solution would be possible that would
not require changing machine classads.

For completeness, we are aware that glide-in frameworks often provide
this kind of functionality, however we're trying to see what we can
achieve without considering glide-in (yet).

Ian


--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.