
Re: [HTCondor-users] startd name



On Fri, Apr 3, 2020 at 1:53 PM Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
On 4/3/2020 11:50 AM, David Schultz wrote:
>
>     Hi David,
>
>     Not sure why you mention the PID of the startd above, as the default used by the startd to name the machine (slot)
>     ClassAds is the fully qualified host name of the server... it has nothing to do with the PID.
>
>
> The collector normally sees startds registered as <slot>@<pid>@hostname, for example:
> slot1@10563@cobalt01.icecube.wisc.edu
>
> I'm not sure where that comes from, but it's nothing I did. Maybe because we're starting them as a user, instead of as
> root?
>

Actually, perhaps it happens because you are starting either the condor_master or the condor_startd with the "-d"
command line flag? If you do a ps and grep for condor_startd, do you see "-d" on the command line? According to the
documentation at https://tinyurl.com/vbp6n9l, it appears that "-d" will cause the PID to be used in the names of various
files and also in the startd name. And from a quick look at the source, it appears that STARTD_HOST might be ignored if
"-d" is specified (which seems like a bug to me).

I suggest you avoid the use of "-d" and instead just customize STARTD_NAME and LOCAL_DIR.
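
To check from the collector side which startds still carry a PID in their name, the <slot>@<pid>@hostname pattern quoted above can be tested with a few lines of Python. This is only a sketch: the function name pid_in_startd_name is hypothetical, and it assumes the PID is the all-digit middle field between two "@" signs, as in the example name from this thread.

```python
from typing import Optional


def pid_in_startd_name(name: str) -> Optional[int]:
    """Return the PID embedded in a name like slot1@10563@host, else None."""
    parts = name.split("@")
    # Assumption: a PID-style name has exactly three "@"-separated fields
    # and the middle field consists only of decimal digits.
    if len(parts) == 3 and parts[1].isdecimal():
        return int(parts[1])
    return None


if __name__ == "__main__":
    for name in ("slot1@10563@cobalt01.icecube.wisc.edu",
                 "slot1@cobalt01.icecube.wisc.edu"):
        print(name, "->", pid_in_startd_name(name))
```

A name without a numeric middle field, such as slot1@cobalt01.icecube.wisc.edu, yields None, which is what one would hope to see after dropping "-d".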

Yes, this was it! Thanks for finding this.

David


>
>     In addition, each instance of HTCondor running on the same server will need its own LOCAL_DIR path. The LOCAL_DIR,
>     specified in the default condor_config that ships with HTCondor, is used to create the file path for the LOG, EXECUTE,
>     SPOOL, and LOCK subdirectories, and these subdirectories cannot be shared across multiple instances of HTCondor running
>     on the same server. Some of the files HTCondor creates in these subdirectories are indeed named based on the PID, so
>     perhaps this is why you mentioned PID collisions above.
>
>
> No, that shouldn't be an issue. They always get their own directories to start in.
>

Sounds good...
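
For anyone setting up several instances on one server, a separate directory tree per instance could be prepared along these lines. This is a minimal sketch, not HTCondor tooling: the helper name make_local_dir and the lowercase subdirectory names are assumptions derived from the LOG, EXECUTE, SPOOL, and LOCK settings mentioned above, and the actual paths depend on the site's condor_config.

```python
from pathlib import Path

# Assumed subdirectory names, mirroring the LOG, EXECUTE, SPOOL and LOCK
# settings discussed in the thread; adjust to match the local condor_config.
SUBDIRS = ("log", "execute", "spool", "lock")


def make_local_dir(base: Path, instance: str) -> Path:
    """Create base/instance with its own log, execute, spool and lock dirs."""
    local_dir = base / instance
    for sub in SUBDIRS:
        # exist_ok=True makes the call safe to repeat for the same instance.
        (local_dir / sub).mkdir(parents=True, exist_ok=True)
    return local_dir
```

Each instance's LOCAL_DIR can then be pointed at the path returned for it, so that no two instances share these subdirectories.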

>
>     Also, you may find it useful to check out (and maybe contribute?) our work to package an HTCondor execute node into a
>     container. Take a look at
>     https://hub.docker.com/r/htcondor/execute
>
>
> That does look interesting. Do you know if it will run in singularity as well as docker?
>

While I use Docker often, I personally haven't used Singularity much. Others on this list can correct me if I am
mistaken, but iirc recent versions of Singularity are highly compatible with Docker, even to the point where
Singularity can pull images from Docker Hub. The goal is for our officially released HTCondor container images to work
with any container runtime supported by Kubernetes. While we are developing and testing first with Docker, as it is the
most widely used runtime for the Kubernetes container runtime interface (CRI), I believe recent versions of Singularity
are also CRI compliant, so that means good news for compatibility looking forward.

>
> The main reason to run HTCondor itself inside a container is because the underlying OS is strange, in that it is not a
> normal RHEL or Ubuntu based distro. Some sites think building their own distro is a great thing to do; we disagree.
>

That is indeed a very good reason to run HTCondor itself inside a container!

Hope the above helps, David!

regards,
Todd





> David
>
>
>     Hope the above helps,
>     Todd
>
>
>     > We're currently using condor version 8.6.1. While this is an older version, I don't see any obvious changes
>     > around this in newer versions.
>     >
>     > Thanks for any insight you can provide.
>     >
>     > David Schultz
>     >
>     > _______________________________________________
>     > HTCondor-users mailing list
>     > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>     > subject: Unsubscribe
>     > You can also unsubscribe by visiting
>     > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>     >
>     > The archives can be found at:
>     > https://lists.cs.wisc.edu/archive/htcondor-users/
>     >
>
>
>     --
>     Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
>     Center for High Throughput Computing   Department of Computer Sciences
>     HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
>     Phone: (608) 263-7132                  Madison, WI 53706-1685
>
>

--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685