
[Condor-users] (Ab)using the Condor scheduler to run jobs on particular machines.



G'day.

We are deploying a Condor pool to do a bunch of the batch processing work we
have previously done using, well, any number of terrible solutions.  This is,
so far, going great, and Condor is doing the right thing on all our dedicated
and non-dedicated nodes.


The one remaining scheduling issue that I am trying to work out how to express
is the handling of "legacy" nodes, which have some special (and annoying)
properties that are not well matched to the usual Condor scheduling paradigm
as far as I can tell.


These are the *only* machines that can run certain specific jobs, such as
performing data extraction for a particular reporting process.  Not for any
good reason, but whatever.  They are also, typically for the same reason, not
suitable for running other jobs.[1]

We would very much like to include them in the Condor system, though, since
they are essential to some of the processes and we would much prefer to have a
single DAGMan submission fetch data from them, and do the rest of the
processing, without introducing another work distribution system.


As far as I can tell the only mechanism to achieve this is to add a specific
startd attribute on those machines, and add that to the job requirements:

  /etc/condor/condor_config on legacy.example.com:
  DedicatedHost = True
  STARTD_ATTRS = $(STARTD_ATTRS) DedicatedHost

  .../example.sub:
  Requirements = DedicatedHost && (Machine == "legacy.example.com")
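
As far as I can see, that job-side requirement alone would not keep other jobs
from matching these machines.  A rough, untested sketch of the machine-side
half, assuming a hypothetical custom job attribute named WantsLegacyHost (the
name is made up for illustration), might look like:

  /etc/condor/condor_config on legacy.example.com:
  # Only start jobs that explicitly opt in (WantsLegacyHost is hypothetical)
  START = (TARGET.WantsLegacyHost =?= True)

  .../example.sub:
  # Custom job attribute, matched by the START expression above
  +WantsLegacyHost = True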


Is there any more intelligent way to do this?  I considered using the rank
options to simply rank these machines very, very low, but that fails the
moment our pool gets busy enough to fill all the non-legacy machine slots.


Ideally, we don't want to restrict this by user, or submitting machine.

Regards,
        Daniel

Footnotes:
[1]  In a couple of cases this is a hard contractual requirement: for data
     security reasons we do not allow certain data to be processed on these
     machines, even though it can run on any other machine in the pool.

-- 
✣ Daniel Pittman            ✉ daniel@xxxxxxxxxxxx            ☎ +61 401 155 707
               ♽ made with 100 percent post-consumer electrons
   Looking for work?  Love Perl?  In Melbourne, Australia?  We are hiring.