
Re: [HTCondor-users] Possible to have submit-implemented per-machine job limits?



From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Date: 09/12/2016 02:27 PM

> Maybe you could run multiple instances of this adjunct job on one 
> physical host by using a job universe that virtualizes the network 
> environment (i.e. docker universe? vm universe?) ?

That's definitely a possibility. I've been looking for an opportunity
to dig into the Docker universe more, in particular with respect to
running Linux jobs on Windows Docker, but haven't been able to line
things up for some experiments yet, and there are some obstacles to
doing so in the project that prompted this question.

> Doing what you want via setting up a custom machine resource (i.e. 
> request_port777 = 1) is exactly what I'd suggest; scenarios like the 
> above are why custom machine resources exist, since this really is a 
> custom machine resource.  For instance, what if two different users both 
> have an app that requires the same static slot?

Good point! The current project is a very small environment, so I forgot
about this consideration, but we'd definitely want to do something
scalable - it's HTCondor after all.

> But given that you cannot configure the execute nodes, perhaps your job 
> requirements could look at the ChildRemoteUser attribute in the 
> partitionable slot?  This attribute is a classad list of all the owners 
> of dynamic slots on the machine.  You could probably leverage this so 
> only one job submitted by you runs on each machine...

Aha! That's what I was looking for! A spiffy new feature of 8.4 that I
hadn't noticed yet, right there on page 239 of the fine manual. :)

The submit description could maybe set an arbitrary accounting group
string for the special job, for instance, and then use
stringList_regexpMember() to reject a match if the partitionable slot's
ChildAccountingGroup list contains it. This approach would prevent the
use of a real accounting group, of course.

So I think you're right: using the machine resource will be the way
to go.
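
For the archive, a minimal sketch of what that would look like,
assuming the resource name "port777" from the example above, a
placeholder executable name, and a single partitionable slot that
holds all of the machine's resources. The first fragment belongs in
the execute node's configuration and the second in the submit
description.

```
# Execute node configuration (e.g. condor_config.local):
# advertise one unit of a custom resource named port777.
MACHINE_RESOURCE_port777 = 1

# Submit description: each job claims the single unit, so at most
# one such job runs on a machine at a time.
executable      = adjunct_job.sh
request_port777 = 1
queue
```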

It's not that I can't modify the config - the issue is that the
tools and scripts for the application in question are distributed
separately from the config, and having a dependency on a specific
HTCondor config invokes some documentation and administrative
requirements that would have been nice to avoid.

Thanks!

        -Michael Pelletier.