Re: [HTCondor-users] runtime vs persistent attributes
- Date: Wed, 09 Oct 2019 01:00:08 +0000
- From: Mary Romelfanger <mary@xxxxxxxxx>
- Subject: Re: [HTCondor-users] runtime vs persistent attributes
Hi Todd,
Your guess was correct. Option 2: I want a node to refuse to run any new jobs after completing a reboot until it is remotely turned on (but I need HTCondor to come up far enough to be able to do that remote turn on). We could add a step to reset the attribute back to "don't run" as part of any clean shut down procedure, but that does not handle any non-controlled reboot events which do happen occasionally.
Your suggestions below confirmed my suspicions about the behavior of the persistent attributes and the associated files. I had actually wondered whether removing that file would cause problems. Thank you for confirming that option. I think I can either add a wrapper script around the master startup or add a line that removes the file to the script that starts HTCondor at boot time. Either option should have the same effect.
I will try that and I will be back if I have any more questions!
THANK YOU
Mary
On 10/8/19, 4:10 PM, "HTCondor-users on behalf of Todd Tannenbaum" <htcondor-users-bounces@xxxxxxxxxxx on behalf of tannenba@xxxxxxxxxxx> wrote:
On 10/8/2019 1:48 PM, Mary Romelfanger wrote:
> Hi everyone,
>
> I am working on setting an attribute for enabling jobs to run on a
> startd node, so that the startd will not start jobs at boot time.
>
Hi Mary,
I am not sure I understand the policy goal you stated above. Could you help me understand what it is you want to do (without any HTCondor specifics)? Do you want to be able to mark nodes as "I want to reboot this node", and have such marked nodes refuse to start new jobs? Or do you want a node to refuse to run any new jobs after completing a reboot until it is remotely "turned on"?
I will attempt to answer your detailed questions below, but it would really help me to help you if I better understood what your end goal is...
> I understand that runtime only attributes will not survive a daemon
> restarting, but that persistent attributes will survive a daemon
> restarting.
That is correct, assuming you are talking about "condor_config_val -rset" versus "condor_config_val -set"
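For illustration, the two variants could be wrapped like this. It is only a sketch: ACCEPT_JOBS is a hypothetical knob name, the function names are made up, and it assumes the node's configuration already allows remote changes to that knob (ENABLE_RUNTIME_CONFIG, ENABLE_PERSISTENT_CONFIG and a matching SETTABLE_ATTRS_<PERMISSION-LEVEL> entry). In both cases the new value is picked up at the next reconfig.

```shell
#!/bin/bash
# Sketch only: ACCEPT_JOBS and these function names are hypothetical.

# Runtime change (-rset): held in memory, lost when the startd restarts.
set_runtime() {
    local host="$1"
    local assignment="$2"
    condor_config_val -name "$host" -startd -rset "$assignment" &&
        condor_reconfig -name "$host" -daemon startd
}

# Persistent change (-set): written to disk on the node, survives restarts.
set_persistent() {
    local host="$1"
    local assignment="$2"
    condor_config_val -name "$host" -startd -set "$assignment" &&
        condor_reconfig -name "$host" -daemon startd
}

# Example (hypothetical host name):
#   set_runtime node1.example.org "ACCEPT_JOBS = True"
```

The matching commands to undo a change are "condor_config_val -runset" and "condor_config_val -unset".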
> It feels like I need something in between or is a restart
> or system reboot treated as more than just a daemon restart?
>
> Does a persistent attribute survive an HTCondor restart (or a system
> rebooting meaning more than a reconfigure)?
>
Yes, a persistent attribute (condor_config_val -set) survives an HTCondor restart.
> By reading I would have guessed not, but the existence of a file for
> persistence implies that that file will be read with an initial start?
Correct.
> If it does read that persistence file at an initial start, is there a
> way to turn that off so that it does not? I want the attribute to
> return back to the default value with a full restart, but I would like
> the attribute to survive any internal startd restarts (master daemon
> saves) in between.
Perhaps a wrapper script around condor_master that does something like
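#!/bin/bash
# Remove any persistent setting for startd config knob 'foo',
# then start the condor_master.
# condor_config_val exits non-zero if the knob is not defined.
if PERSIST_CONFIG_DIR=$(condor_config_val persistent_config_dir 2> /dev/null) \
        && [[ -n $PERSIST_CONFIG_DIR ]]; then
    # -f: stay quiet if the file is already gone
    rm -f "$PERSIST_CONFIG_DIR/.config.STARTD/foo"
fi
# "$@" passes each original argument through unchanged
exec /usr/sbin/condor_master "$@"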
Warning! The above is off the top of my head... just something to think about, don't
cut-n-paste the above into production!
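Since the file path in the script is a guess, it may be worth checking what the startd actually writes before deleting anything. A minimal sketch, assuming PERSISTENT_CONFIG_DIR is defined on the node (list_persistent_config is just an illustrative name):

```shell
#!/bin/bash
# Sketch: show what is stored in the persistent config directory.
list_persistent_config() {
    local dir
    # condor_config_val exits non-zero when the knob is not defined.
    if ! dir=$(condor_config_val PERSISTENT_CONFIG_DIR 2> /dev/null) || [ -z "$dir" ]; then
        echo "PERSISTENT_CONFIG_DIR is not defined" >&2
        return 1
    fi
    # -A includes dotfiles, since the persisted files may be hidden
    ls -A "$dir"
}

# Example: run once after a "condor_config_val -set" and compare
#   list_persistent_config
```

Running it before and after a "condor_config_val -set" shows which file appears, and that is the file the wrapper should remove.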
Hope the above helps,
regards
Todd
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/