
Re: [HTCondor-users] Hold reason: Error from slot1@... Failed to open /var/lib/condor/spool...



Hi,

My current HTCondor configuration is limited to a local submit host. Anyway, thank you for these security recommendations.

What about the 'Hold reason' that I'm experiencing? Any hints on how to solve that?

Thanks,

--
C. Adean


From: "Brian Lin" <blin@xxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>, "Carlos Adean" <carlosadean@xxxxxxxxxxxx>
Sent: Monday, January 27, 2020 3:53:37 PM
Subject: Re: [HTCondor-users] Hold reason: Error from slot1@... Failed to open /var/lib/condor/spool...

Hi Carlos,

For remote job submission, you're restricted to GSI, SSL, or Kerberos authentication. If you don't have the infrastructure to support any of the aforementioned methods, you should consider just supporting job submission from a local submit host.

I would warn against using CLAIMTOBE on any host in your pool, as it's effectively the same as no security: e.g., I could pose as you for job submission or even have my own host join your pool! For more details, I highly recommend reading through the security portion of the HTCondor manual, particularly the authentication section: https://htcondor.readthedocs.io/en/stable/admin-manual/security.html#authentication
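
As a rough illustration of how one might check for this, the sketch below tests whether a configured method list includes CLAIMTOBE. It is not an HTCondor tool: the function names parse_methods and uses_claimtobe are hypothetical, and it assumes you paste in the string printed by something like `condor_config_val SEC_DEFAULT_AUTHENTICATION_METHODS` on the host in question.

```python
def parse_methods(value):
    """Split a comma- or space-separated method list into upper-case names."""
    return [name.upper() for name in value.replace(",", " ").split()]


def uses_claimtobe(value):
    """Return True if CLAIMTOBE appears anywhere in the method list."""
    return "CLAIMTOBE" in parse_methods(value)


if __name__ == "__main__":
    # Example value only; substitute the value reported on your own host.
    configured = "FS, CLAIMTOBE, KERBEROS"
    if uses_claimtobe(configured):
        print("CLAIMTOBE is enabled; consider removing it.")
    else:
        print("CLAIMTOBE is not in the list.")
```

Running the same check against both the SEC_DEFAULT_ and SEC_CLIENT_ values covers the two settings discussed in this thread.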

- Brian

On 1/27/20 12:32 PM, Carlos Adean wrote:
Hello HTCondor list,

I'm trying to submit a simple job using the remote option, but I received this error message.

$ condor_submit -remote loginicx.ib0.cm.linea.gov.br vj.sub
Submitting job(s)
ERROR: Failed to connect to queue manager loginicx.ib0.cm.linea.gov.br
AUTHENTICATE:1003:Failed to authenticate with any method
AUTHENTICATE:1004:Failed to authenticate using GSI
GSI:5003:Failed to authenticate.  Globus is reporting error (851968:38).  There is probably a problem with your credentials.  (Did you run grid-proxy-init?)
AUTHENTICATE:1004:Failed to authenticate using KERBEROS
AUTHENTICATE:1004:Failed to authenticate using FS

Looking for a solution, I found this thread: https://www-auth.cs.wisc.edu/lists/htcondor-users/2016-July/msg00060.shtml. I then set the parameters below, and now I can submit a job.

# on the server
SEC_DEFAULT_AUTHENTICATION_METHODS = FS, CLAIMTOBE, $(SEC_DEFAULT_AUTHENTICATION_METHODS)

# on the client
SEC_CLIENT_AUTHENTICATION_METHODS = FS, CLAIMTOBE, $(SEC_CLIENT_AUTHENTICATION_METHODS)


However, as you can see, HTCondor put the job in a hold state. If I use the same submit file from the submit host, it runs without errors.

---
47128.000:  Job is held.
Hold reason: Error from slot1@xxxxxxxxxxxxxxxxxxxxxxxxx: Failed to open '/var/lib/condor/spool/7128/0/cluster47128.proc0.subproc0/_condor_stdout' as standard output: No such file or directory (errno 2)
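
For a message like the one above, a small diagnostic sketch can pull out the path and errno and report the deepest directory on that path that actually exists, which shows where the "No such file or directory" starts. The names parse_hold_reason and deepest_existing_dir are hypothetical helpers, the regular expression is matched only against the wording shown here, and the check is only meaningful when run on the host that owns the spool directory.

```python
import os
import re

# Pattern written against the hold-reason wording quoted in this message.
HOLD_RE = re.compile(
    r"Failed to open '(?P<path>[^']+)' as standard (?P<stream>\w+): "
    r"(?P<message>.+) \(errno (?P<errno>\d+)\)"
)


def parse_hold_reason(text):
    """Return path, stream, message and errno from a hold reason, or None."""
    match = HOLD_RE.search(text)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "stream": match.group("stream"),
        "message": match.group("message"),
        "errno": int(match.group("errno")),
    }


def deepest_existing_dir(path):
    """Walk up from the file's directory until an existing directory is found."""
    current = os.path.dirname(path)
    while current and not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    return current


if __name__ == "__main__":
    reason = (
        "Failed to open '/var/lib/condor/spool/7128/0/"
        "cluster47128.proc0.subproc0/_condor_stdout' as standard output: "
        "No such file or directory (errno 2)"
    )
    info = parse_hold_reason(reason)
    print(info["stream"], info["errno"])
    print("deepest existing directory:", deepest_existing_dir(info["path"]))
```

Comparing the reported directory with the full path shows which component of the spool path was never created.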



Any hints on how to solve this problem?


My HTCondor version is:
$CondorVersion: 8.8.1 Feb 18 2019 BuildID: 461773 PackageID: 8.8.1-1 $
$CondorPlatform: x86_64_RedHat7 $



Thanks,


--
C. Adean

