
Re: [HTCondor-users] startd name



On 4/3/2020 11:50 AM, David Schultz wrote:

    Hi David,

    Not sure why you mention the PID of the startd above, as the default used by the startd to name the machine (slot)
    classads is the fully qualified host name of the server.... it has nothing to do with the pid.


The collector normally sees startds registered as <slot>@<pid>@hostname, for example:
slot1@10563@cobalt01.icecube.wisc.edu
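
For illustration only, here is a minimal Python sketch that splits a name of the shape shown above into its parts. It assumes the middle component, when present, is a numeric PID as described in this thread; the names SlotName and parse_slot_name are hypothetical helpers, not part of HTCondor.

```python
from typing import NamedTuple, Optional


class SlotName(NamedTuple):
    slot: str
    pid: Optional[int]  # None when the name has no middle component
    host: str


def parse_slot_name(name: str) -> SlotName:
    """Split 'slot@host' or 'slot@pid@host' into its parts."""
    parts = name.split("@")
    if len(parts) == 2:
        return SlotName(parts[0], None, parts[1])
    # Assumption from this thread: the middle field is a numeric PID.
    if len(parts) == 3 and parts[1].isdigit():
        return SlotName(parts[0], int(parts[1]), parts[2])
    raise ValueError(f"unrecognized slot name: {name!r}")


print(parse_slot_name("slot1@10563@cobalt01.icecube.wisc.edu"))
```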

I'm not sure where that comes from, but it's nothing I did. Maybe because we're starting them as a user, instead of as root?


Actually, perhaps it is happening because you are starting either the condor_master or the condor_startd with the "-d" command-line flag? If you do a ps and grep for condor_startd, do you see "-d" on the command line? According to the documentation at https://tinyurl.com/vbp6n9l, it appears that "-d" will cause the PID to be used in the names of various files and also in the startd name. And from a quick look at the source, it appears that STARTD_NAME might be ignored if "-d" is specified (which seems like a bug to me).

I suggest you avoid the use of "-d" and instead just customize STARTD_NAME and LOCAL_DIR.
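
As a rough sketch of that suggestion, the Python below renders a per-instance configuration fragment that sets only STARTD_NAME and LOCAL_DIR. The helper name render_instance_config, the instance name, and the base path are hypothetical; how the fragment gets picked up by each instance is left to your own setup.

```python
def render_instance_config(instance: str, base_dir: str) -> str:
    """Return config text giving one instance its own name and LOCAL_DIR."""
    # Hypothetical layout: one directory per instance under base_dir.
    local_dir = f"{base_dir}/{instance}"
    lines = [
        f"STARTD_NAME = {instance}",
        f"LOCAL_DIR = {local_dir}",
    ]
    return "\n".join(lines) + "\n"


# Example values are made up for illustration.
print(render_instance_config("glidein_a", "/scratch/condor"), end="")
```

Keeping the fragment to these two settings relies on the stock condor_config deriving the other paths from LOCAL_DIR.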


    In addition, each instance of HTCondor running on the same server will need its own LOCAL_DIR path. The LOCAL_DIR,
    specified in the default condor_config that ships with HTCondor, is used to create the file path for the LOG, EXECUTE,
    SPOOL, and LOCK subdirectories, and these subdirectories cannot be shared across multiple instances of HTCondor running
    on the same server. Some of the files HTCondor will create in these subdirectories are indeed named based on the PID, so
    perhaps this is why you mentioned PID collisions above.
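
    To make the layout concrete, here is a small Python sketch that creates a separate directory tree per instance. The lowercase subdirectory names (log, execute, spool, lock) and the helper make_instance_dirs are assumptions for illustration; check your own condor_config for the actual paths.

```python
from pathlib import Path
from typing import Dict

# Assumed subdirectory names under each instance's LOCAL_DIR.
SUBDIRS = ("log", "execute", "spool", "lock")


def make_instance_dirs(base: Path, instance: str) -> Dict[str, Path]:
    """Create base/instance/{log,execute,spool,lock} and return the paths."""
    local_dir = base / instance
    paths = {name.upper(): local_dir / name for name in SUBDIRS}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

    Calling it once per instance name under the same base yields disjoint trees, so no two instances share a LOG, EXECUTE, SPOOL, or LOCK directory.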


No, that shouldn't be an issue. They always get their own directories to start in.


Sounds good...


    Also, you may find it useful to check out (and maybe contribute?) our work to package an HTCondor execute node into a
    container. Take a look at
    https://hub.docker.com/r/htcondor/execute


That does look interesting. Do you know if it will run in singularity as well as docker?


While I use Docker often, I personally haven't used Singularity much. Others on this list can correct me if I am mistaken, but iirc recent versions of Singularity are highly compatible with Docker, even to the point where Singularity can pull images from Docker Hub. The goal is for our officially released HTCondor container images to work with any container runtime supported by Kubernetes. While we are developing and testing first with Docker, as it is the most widely used Kubernetes container runtime interface (CRI), I believe recent versions of Singularity are also CRI compliant, so that means good news for compatibility going forward.


The main reason to run HTCondor itself inside a container is because the underlying OS is strange, in that it is not a normal RHEL or Ubuntu based distro. Some sites think building their own distro is a great thing to do; we disagree.


That is indeed a very good reason to run HTCondor itself inside a container!

Hope the above helps, David!

regards,
Todd





David


    Hope the above helps,
    Todd


     > We're currently using condor version 8.6.1. While this is an older version, I don't see any obvious changes around this
     > in newer versions.
     >
     > Thanks for any insight you can provide.
     >
     > David Schultz
     >
     > _______________________________________________
     > HTCondor-users mailing list
     > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
     > subject: Unsubscribe
     > You can also unsubscribe by visiting
     > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
     >
     > The archives can be found at:
     > https://lists.cs.wisc.edu/archive/htcondor-users/
     >




--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685