
Re: [HTCondor-users] HTCondor 9.1



I think slots are appearing as localhost because your condor_config is telling condor to use localhost as the primary network interface.  

What does your condor_config have set for NETWORK_INTERFACE?

Try running

   condor_config_val -v NETWORK_INTERFACE
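
If that turns out to be set to a loopback address, a minimal sketch of an override might look like the following. The file name and the subnet are placeholders (assumptions, not values from your setup); use the address or interface that the other machines in the pool can actually reach.

    # Hypothetical local config file, e.g. /etc/condor/config.d/50-network.conf
    # Placeholder subnet: replace with the address (or device name) of the
    # interface that faces the rest of the pool.
    NETWORK_INTERFACE = 192.168.1.*

After changing this, restart HTCondor on that machine and check condor_status again.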

By the way, you can see all of your configuration that differs from the default HTCondor configuration by running

    condor_config_val -summary

When a job runs, files will be written as nobody if the job runs as nobody, which happens when HTCondor does not think that the submit node and the execute node have the same set of user ids.  It decides this by comparing the value of UID_DOMAIN on both of these machines. 

Try running

    condor_config_val  -v UID_DOMAIN

on both the submit machine and the execute machine. What value does each one report?

Having files written as nobody on the execute node is not a problem when HTCondor is doing file transfer, because it changes ownership of the files as it transfers the results back. But if you are using a shared file system, you may need to do some additional configuration.
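
As a rough sketch of what that additional configuration usually involves, assuming all machines share the same user accounts and mount the same NAS, the same values would be set on both the submit and execute machines. The domain name below is a placeholder, not something taken from your pool.

    # Placeholder domain: any string works as long as it is identical
    # on the submit machine and on every execute machine.
    UID_DOMAIN = example.com
    FILESYSTEM_DOMAIN = example.com
    # Possibly also needed if the domain above is not part of the
    # machines' fully qualified host names (left commented out here):
    # TRUST_UID_DOMAIN = True

Run condor_reconfig (or restart HTCondor) after editing, then re-check the values with condor_config_val as above.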

Instructions for setting up HTCondor to use a shared file system are here:

Configuration Macros — HTCondor Manual 9.1.0 documentation


-tj



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Lyle Pakula <Lyle@xxxxxxxxxxxxxxxx>
Sent: Monday, July 26, 2021 7:14 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] HTCondor 9.1
 
Hi Everyone and thanks for everyone's help in advance!

We have recently upgraded from a very old install of 7.6 to 9.1 on Ubuntu 18.04 by basically blowing away everything old (uninstall, remove systemctl, delete "condor user" from all machines) and then following https://htcondor.readthedocs.io/en/latest/getting-htcondor/admin-quick-start.html.

* Starting with a basic setup (3 Machines, 3 roles) + NAS mounted on all machines. 
* Vanilla universe Jobs read/write to and from the NAS 

Question 1 - Why are slots appearing as "localhost" and not the machine name they are actually on?
lyle@tuna:~$ condor_status
Name            OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:39
slot2@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:36
slot3@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:33
slot4@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:32
slot5@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:31
slot6@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:42
slot7@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:41
slot8@localhost LINUX      X86_64 Unclaimed Idle      0.000 1990  0+00:30:41

Question 2 - Files are written as nobody:nouser. How can we change this?
The problem here is that the written files are unreadable and unwritable by the submitter.

Tried this but did not work 

Thanks, Lyle

--
AE CAPITAL
Ground Floor, 555 Bourke Street, Melbourne Australia 3000

p +61 3 9020 7801
m +61 (0)434 872 054
w http://www.aecapital.com.au


AE Capital Pty Limited (ACN 153 242 865) is regulated by the Australian Securities & Investments Commission and is a Corporate Authorised Representative of JFM Pty Limited (ACN 125 150 656), holder of an Australian Financial Services Licence (AFSL 314585).  AE Capital Pty Limited is a member of the National Futures Association (ID 0498660).