
Re: [HTCondor-users] Requirements statement to exclude a list of machines in a file



Greg:

What you want is a startd classad hook that periodically queries this particular hostname and publishes a boolean attribute which evaluates to True if the resolution succeeds and False otherwise. Then direct the user to express this as a job requirement.

https://htcondor.readthedocs.io/en/latest/misc-concepts/hooks.html#index-75

Hooks can be used for other purposes, but here the idea is for it to define a measure of its own health and express it as an attribute. Honestly, if DNS is flaky you might simply consider making this attribute a requirement for all jobs via APPEND_REQUIREMENTS.

https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html#condor-submit-configuration-file-entries
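As a rough sketch of the probe such a hook could run: the script below tries to resolve one hostname and prints a single "Attribute = Value" line on stdout, which is the form a startd cron job uses to publish attributes into the machine ad. The attribute name CanResolveTargetHost and the function names are made up for illustration, and the hostname is passed as a command-line argument. Registering the script with the startd is done through the STARTD_CRON_* configuration described in the hooks documentation linked above and is not shown here.

```python
#!/usr/bin/env python3
"""Probe for a startd cron hook: does a given hostname resolve on this machine?"""
import socket
import sys

# Hypothetical attribute name, not taken from the thread.
ATTRIBUTE = "CanResolveTargetHost"


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver returns at least one address for hostname."""
    try:
        return len(resolver(hostname, None)) > 0
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False


def classad_line(attribute, value):
    """Format a boolean as a ClassAd attribute assignment."""
    return f"{attribute} = {'True' if value else 'False'}"


# Only probes when exactly one hostname is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(classad_line(ATTRIBUTE, can_resolve(sys.argv[1])))
```

Assuming the attribute is published under that name, the user's submit file could then carry something like requirements = (CanResolveTargetHost =?= True), so that machines which have not reported the attribute are also skipped.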

Tom

On Wed, May 12, 2021 at 7:32 PM Hitchen, Greg (IM&T, Kensington WA) <Greg.Hitchen@xxxxxxxx> wrote:

Thanks Vikrant and Todd for your suggestions. I'll give them a try and see how it goes.

Motivation is for a temporary kludge while our Networks Team (and Active Directory DNS Team) sort out a DNS issue within "some" of our wireless VLANs.

Now that we are including laptop machines as execute nodes, many of these hook into the local on-site wifi systems. Some are also connected directly via wired ethernet at the same time.

These laptops and wireless networks are spread over multiple sites around the country, and there seem to be some possible firewall issues.

nslookup fails for some wireless VLANs, weirdly enough when talking to a DNS server via IPv6, but not IPv4. A laptop in a different wireless VLAN can talk to the same DNS server via IPv6 OK.

This user's jobs talk to a specific hostname, so jobs fail if this name is not resolved. The executable is sort of 3rd party, so using an IP address instead will require a recompile, which could take weeks and is out of our immediate control. His failed jobs requeue but it is affecting his overall job throughput.

So, just trying to keep a user happy while the things I have no control over get sorted out.

Cheers

Greg

From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Sent: Thursday, 13 May 2021 4:25 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>; Hitchen, Greg (IM&T, Kensington WA) <Greg.Hitchen@xxxxxxxx>
Subject: Re: [HTCondor-users] Requirements statement to exclude a list of machines in a file


On 5/11/2021 10:39 PM, Hitchen, Greg (IM&T, Kensington WA) wrote:

Hi All

Quick question.

I know how to exclude a certain machine (or a few), e.g. with the requirements = (Machine =!= "unwanted_machine.something")
type of statement, or even using it with regexp, but what about a list of 200+ randomly named machines listed in a file?

Is there a quick/easy/hard/kludgy/dirty way of doing this?


Hi Greg,

Curious, what is your motivation for this?

At any rate, a quick-n-dirty example off the top of my head would look like the below. Maybe there is a better / more elegant way, especially if you care to use Python, but this is the first thing that came to mind using the command-line tools:

Contents of file "badlist.txt":

# List of machines to avoid, note the backslash
# character serving as a line-continuation at the end
# of each line.
BadList = \
foo.xxx.edu \
bar.xxx.edu \
alpha.xxx.edu \
beta.xxx.edu

Contents of your submit file:

executable = foo.exe
requirements = stringListIMember(Machine,"$(BadList)")==False
include : badlist.txt
queue
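
If the 200+ names already sit in a plain file with one hostname per line, a small Python helper could write badlist.txt in the layout above rather than adding the backslashes by hand. This is only a sketch: the function name render_badlist and the two command-line arguments (input file, output file) are assumptions for illustration, and blank lines or lines starting with "#" in the input are skipped.

```python
#!/usr/bin/env python3
"""Turn a one-hostname-per-line file into a BadList submit macro file."""
import sys


def render_badlist(hostnames, macro="BadList"):
    """Render hostnames as one macro definition with line continuations."""
    names = [h.strip() for h in hostnames]
    # Skip blank lines and comment lines from the input list.
    names = [h for h in names if h and not h.startswith("#")]
    if not names:
        return f"{macro} =\n"
    # Every line except the last ends with a backslash continuation.
    return f"{macro} = \\\n" + " \\\n".join(names) + "\n"


# Usage (hypothetical file names): make_badlist.py machines.txt badlist.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.write(render_badlist(src))
```

The submit file stays as shown; only badlist.txt is regenerated whenever the list of machines changes.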



Hope this helps,
Todd
