
[Condor-users] PERMISSION DENIED to unknown user from host and connection to astro.cs.wisc.edu



Hello,
This message was initially sent to condor-admin@xxxxxxxxxxx on October 17th, but apart from the original reply there has been no further communication on this topic, so I am reposting it to the mailing list; sorry for the inconvenience. Note that I checked the logs again today, and what I described in the original email still occurs.

1) We recently set up Condor 6.8.1 on a pool of Linux and Windows computers. The master controller is on a Linux computer.
Everything seems to work OK, we are able to get the pool status, to submit jobs and to get the results back at the end of their runs. 

But when we looked at the CollectorLog file on the master controller, we noticed a number of error messages that occur frequently:

10/18 09:02:03 DaemonCore: PERMISSION DENIED to unknown user from host <x.x.x.x:portq> for command 10 (QUERY_STARTD_PVT_ADS)
10/18 09:02:03 DaemonCore: PERMISSION DENIED to unknown user from host <x.x.x.x:portb> for command 49 (UPDATE_NEGOTIATOR_AD)

The IP address listed in the error can be that of any Condor computer that has been added to the pool.
After searching Google and the Condor mailing list for documentation, we tried changing HOSTALLOW_READ and HOSTALLOW_WRITE:

- to a numeric IP range instead of a name-based range

- to a single *

This does not seem to change anything.

We also tried to set up the master controller on one of the Windows computers, with the same results in the master controller's log files.
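
To see which hosts and commands are affected, the PERMISSION DENIED entries can be tallied with a small script. This is only a minimal Python sketch and not part of Condor: the function name tally_denied and the regular expression are our own, inferred from the two CollectorLog lines quoted above, so other Condor versions may format these lines differently.

```python
import re
from collections import Counter

# Pattern inferred from the two CollectorLog lines quoted above (assumption:
# other Condor versions may word or order these fields differently).
DENIED = re.compile(
    r"PERMISSION DENIED to (?P<user>.+?) from host <(?P<host>[^:>]+):(?P<port>[^>]+)> "
    r"for command (?P<num>\d+) \((?P<name>\w+)\)"
)


def tally_denied(lines):
    """Count PERMISSION DENIED entries per (host, command name)."""
    counts = Counter()
    for line in lines:
        match = DENIED.search(line)
        if match:
            counts[(match.group("host"), match.group("name"))] += 1
    return counts


# Sample input: the anonymised lines from this message, plus one unrelated line.
sample = [
    "10/18 09:02:03 DaemonCore: PERMISSION DENIED to unknown user from host "
    "<x.x.x.x:portq> for command 10 (QUERY_STARTD_PVT_ADS)",
    "10/18 09:02:03 DaemonCore: PERMISSION DENIED to unknown user from host "
    "<x.x.x.x:portb> for command 49 (UPDATE_NEGOTIATOR_AD)",
    "10/18 08:56:43 attempt to connect to <128.105.143.14:9618> failed: "
    "timed out after 20 seconds.",
]

for (host, command), count in sorted(tally_denied(sample).items()):
    print(host, command, count)
```

On a real pool the same function can be fed an open CollectorLog file object instead of the sample list; the location of that file depends on the local installation.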

2) Additionally, from time to time the master controller tries to connect to 128.105.143.14, which appears to be one of your computers.

10/18 08:56:43 attempt to connect to <128.105.143.14:9618> failed: timed out after 20 seconds.
10/18 08:56:43 ERROR: SECMAN:2003:TCP auth connection to <128.105.143.14:9618> failed
10/18 08:56:43 Can't send UPDATE_COLLECTOR_AD to collector (<128.105.143.14:9618>): Failed to send UDP update command to collector

>host 128.105.143.14
14.143.105.128.in-addr.arpa domain name pointer astro.cs.wisc.edu.
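
The same reverse lookup can be reproduced without the host command. The following is a minimal Python sketch using only the standard library; the helper names ptr_name and reverse_lookup are our own, and reverse_lookup needs working DNS access, so it is defined here but not called.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Build the in-addr.arpa name that a PTR query for this address uses."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the hostname from a reverse DNS lookup, or None if it fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


print(ptr_name("128.105.143.14"))  # prints 14.143.105.128.in-addr.arpa
```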

3) Furthermore (not mentioned in the original mail), it seems that at some point the master controller daemon tries to send emails to the University of Wisconsin, which get bounced back. The original mail sent by the master controller contains a monthly report of connected hosts, similar to what you get with condor_status. I have checked in both the configuration file and the local configuration file that CONDOR_ADMIN is properly set to a local mail address. Is there any way to disable that?

    **      THIS IS A WARNING MESSAGE ONLY      **
    **  YOU DO NOT NEED TO RESEND YOUR MESSAGE  **
    **********************************************

The original message was received at Wed, 8 Nov 2006 00:07:37 +1100 from localhost.localdomain [x.x.x.x]

   ----- Transcript of session follows -----
451 4.4.1 reply: read error from shale.cs.wisc.edu.
451 4.4.1 reply: read error from limestone.cs.wisc.edu.
451 4.4.1 reply: read error from silica.cs.wisc.edu.
451 4.4.1 reply: read error from obsidian.cs.wisc.edu.
451 4.4.1 reply: read error from granite.cs.wisc.edu.
451 4.4.1 reply: read error from basalt.cs.wisc.edu.
451 4.4.1 reply: read error from lava.cs.wisc.edu.
451 4.4.1 reply: read error from gypsum.cs.wisc.edu.
451 4.4.1 reply: read error from redherring.cs.wisc.edu.
<condor-admin@xxxxxxxxxxx>... Deferred: Connection reset by redherring.cs.wisc.edu.
Warning: message still undelivered after 4 hours
Will keep trying until message is 5 days old

Thanks for your help.

---- 
Fabrice Bouyé (http://fabricebouye.cv.fm/) 
Research Officer (Data) 
Tel: +687 26 20 00 (Ext 411) 
Oceanic Fisheries, Pacific Community 
http://www.spc.int/