
Re: [HTCondor-users] Configuring a CE/Schedd



Hi Iain,

From the message:

> 03/24/15 19:13:14 DC_AUTHENTICATE: required authentication of 128.142.132.67 failed: AUTHENTICATE:1002:Failure performing handshake|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_XXXWRRJqi)|AUTHENTICATE:1004:Failed to authenticate using FS|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate using GSI|GSI:5002:Failed to authenticate because the remote (client) side was not able to acquire its credentials.

The important part of the message is:

"the remote (client) side was not able to acquire its credentials"

This indicates that the schedd isn’t using its certificate (or isn’t configured with one).
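
To make that easier to see, here is a minimal Python sketch that splits the error stack on "|" so each failure sits on its own line. It assumes every entry has the form SUBSYSTEM:code:message, which is inferred from this one log line and not from any documented format. split_error_stack is a hypothetical helper for illustration, not part of HTCondor.

```python
# Error stack copied verbatim from the DC_AUTHENTICATE log line quoted above.
ERROR_STACK = (
    "AUTHENTICATE:1002:Failure performing handshake|"
    "AUTHENTICATE:1004:Failed to authenticate using FS|"
    "FS:1004:Unable to lstat(/tmp/FS_XXXWRRJqi)|"
    "AUTHENTICATE:1004:Failed to authenticate using FS|"
    "AUTHENTICATE:1004:Failed to authenticate using KERBEROS|"
    "AUTHENTICATE:1004:Failed to authenticate using GSI|"
    "GSI:5002:Failed to authenticate because the remote (client) side "
    "was not able to acquire its credentials."
)


def split_error_stack(stack):
    """Split 'SUBSYS:code:message|...' into (subsystem, code, message) tuples.

    Assumption: entries are '|'-separated and the first two ':' in each
    entry delimit the subsystem and the numeric code.
    """
    entries = []
    for part in stack.split("|"):
        subsystem, code, message = part.split(":", 2)
        entries.append((subsystem, int(code), message))
    return entries


entries = split_error_stack(ERROR_STACK)

# Print one failure per line, in the order they appear in the log.
for subsystem, code, message in entries:
    print(f"{subsystem:<12} {code}  {message}")

# The final entry is the GSI failure quoted above.
last_subsystem, last_code, last_message = entries[-1]
```

Read this way, the FS and KERBEROS entries are earlier methods that were tried and failed, and the last entry is the GSI credential failure.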

Is this from a shadow log?  If so, you don't want to be using any of these methods - you should be using match auth for your setup.  Perhaps that's something which got lost in the merge?

Brian

> On Mar 24, 2015, at 1:48 PM, Iain Bradford Steers <iain.steers@xxxxxxx> wrote:
> 
> Hi,
> 
> I'm in the process of finalizing the CE/Schedd setup for our pool; we're using Puppet.
> 
> I had the CE working and acting as a scheduler with a manual config and decided to move it to the HEP-Puppet/htcondor module.
> 
> This is the output I get in SchedLog (*). I've removed the IP, but it is the machine's own IP in all instances.
> 
> After this it just proceeds to spam condor_write errors until it fills the log file and starts a new one.
> 
> The CE is in the certificate mapfile along with all the other hosts, and apart from the ordering of hostnames, vimdiff shows no difference between the security config file for this host and the one that the central manager uses.
> 
> Has anyone else experienced this issue?
> 
> Thanks, Iain
> 
> (*)
> 03/24/15 19:12:28 Address rewriting: Warning: attribute 'ScheddIpAddr' <MACHINE_IP:9618?noUDP&sock=17305_aee5_3> == <MACHINE_IP:9618?noUDP&sock=17305_aee5_3>, but old logic couldn't find the command port for outbound interface MACHINE_IP.
> 03/24/15 19:12:28 Address rewriting: Warning: attribute 'ScheddIpAddr' address in ad (<MACHINE_IP:9618?noUDP&sock=17305_aee5_3>) == command socket (<MACHINE_IP:9618?noUDP&sock=17305_aee5_3>), but old logic couldn't find that command socket in its list.
> 03/24/15 19:12:28 Address rewriting: Warning: attribute 'MyAddress' <MACHINE_IP:9618?noUDP&sock=17305_aee5_3> == <MACHINE_IP:9618?noUDP&sock=17305_aee5_3>, but old logic couldn't find the command port for outbound interface MACHINE_IP.
> 03/24/15 19:12:28 Address rewriting: Warning: attribute 'MyAddress' address in ad (<MACHINE_IP:9618?noUDP&sock=17305_aee5_3>) == command socket (<MACHINE_IP:9618?noUDP&sock=17305_aee5_3>), but old logic couldn't find that command socket in its list.
> 03/24/15 19:12:33 -------- Begin starting jobs --------
> 03/24/15 19:12:33 -------- Done starting jobs --------
> 03/24/15 19:13:14 Received a superuser command
> 03/24/15 19:13:14 This process has a valid certificate & key
> 03/24/15 19:13:14 Failed to read end of message from <MACHINE_IP:34711>; 1280 untouched bytes.
> 03/24/15 19:13:14 condor_write(): Socket closed when trying to write 13 bytes to <MACHINE_IP:34711>, fd is 15, errno=104 Connection reset by peer
> 03/24/15 19:13:14 Buf::write(): condor_write() failed
> 03/24/15 19:13:14 condor_read(): Socket closed when trying to read 5 bytes from <MACHINE_IP:34711> in non-blocking mode
> 03/24/15 19:13:14 IO: EOF reading packet header
> 03/24/15 19:13:14 condor_read(): Socket closed when trying to read 5 bytes from <MACHINE_IP:34711>
> 03/24/15 19:13:14 IO: EOF reading packet header
> 03/24/15 19:13:14 AUTHENTICATE: handshake failed!
> 03/24/15 19:13:14 DC_AUTHENTICATE: required authentication of 128.142.132.67 failed: AUTHENTICATE:1002:Failure performing handshake|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1004:Unable to lstat(/tmp/FS_XXXWRRJqi)|AUTHENTICATE:1004:Failed to authenticate using FS|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate using GSI|GSI:5002:Failed to authenticate because the remote (client) side was not able to acquire its credentials.