Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api



Changing UID_DOMAIN to a common string on the HTCondor submit and compute nodes seems to have fixed the problem. Thanks, John!

For future reference, was this a bug, or is setting UID_DOMAIN = * not supported in this way?
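
For anyone checking their own nodes, here is a minimal sketch of a check for the setting discussed in this thread. It assumes a simple NAME = VALUE configuration text in which the last assignment wins, and it ignores includes and macro expansion. The helper name uid_domain_from_config and the domain cluster.example are hypothetical; neither is part of HTCondor or its Python bindings.

```python
def uid_domain_from_config(config_text):
    """Return the last UID_DOMAIN value in a NAME = VALUE text, or None.

    Assumption: plain 'NAME = VALUE' lines, last assignment wins.
    This is an illustration, not HTCondor's own configuration parser.
    """
    value = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "UID_DOMAIN":
            value = val.strip()
    return value


# Configuration as posted in this thread.
posted = """
CONDOR_HOST = condor-master
DAEMON_LIST = MASTER, SCHEDD
UID_DOMAIN = *
TRUST_UID_DOMAIN = True
"""

# Hypothetical corrected configuration with a common domain string.
corrected = posted.replace("UID_DOMAIN = *", "UID_DOMAIN = cluster.example", 1)

posted_value = uid_domain_from_config(posted)
corrected_value = uid_domain_from_config(corrected)
print(posted_value == "*", corrected_value)
```

A value of * from this check on a submit or compute node would point at the situation described in this thread.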


On Mon, Mar 12, 2018 at 11:14 AM John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

Yes, that seems likely.

From: Biruk Mammo [mailto:birukw@xxxxxxxxxx]
Sent: Monday, March 12, 2018 1:07 PM
To: John M Knoeller <johnkn@xxxxxxxxxxx>
Cc: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api

Aha, thanks John!

I have no map file configured. The scheduler's configuration is as follows:

ALLOW_WRITE = $(ALLOW_WRITE), $(CONDOR_HOST)
CONDOR_HOST = condor-master
DAEMON_LIST = MASTER, SCHEDD
DISCARD_SESSION_KEYRING_ON_STARTUP = False
UID_DOMAIN = *
TRUST_UID_DOMAIN = True

Is the UID_DOMAIN setting the culprit?

On Mon, Mar 12, 2018 at 9:56 AM John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

Yes, the problem is here:

myusername@*

The * here should be a domain name. Because it is a * instead, and * is used as a token separator, the remainder isn't being parsed correctly.

(more specifically, there should only be one * between the username and the condor version string)
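
The effect described above can be illustrated with a short Python sketch. This is not HTCondor's actual parser (the assertion in the original report points at the C++ source, sock.cpp); it only assumes that fields are joined with * and split on * again. The domain example.org and the shortened version string are made up for illustration.

```python
SEPARATOR = "*"


def split_fields(serialized, sep=SEPARATOR):
    """Split a separator-delimited string into its fields (illustration only)."""
    return serialized.split(sep)


# Hypothetical, shortened fragments of a serialized string.
# With a real domain there is exactly one '*' between user and version.
good = "myusername@example.org*$CondorVersion:_8.7.6_$*0"
# With UID_DOMAIN = '*', the domain itself looks like a separator.
bad = "myusername@**$CondorVersion:_8.7.6_$*0"

good_fields = split_fields(good)
bad_fields = split_fields(bad)

print(len(good_fields), good_fields)
print(len(bad_fields), bad_fields)
```

In the second case the split yields one extra, empty field, the user name loses its domain, and every later field is shifted by one position, which is the kind of mismatch a strict parser would reject.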

So, something odd is going on in the SCHEDD when it authenticates. Do you have a map file?

Â

-tj

From: Biruk Mammo [mailto:birukw@xxxxxxxxxx]
Sent: Saturday, March 10, 2018 10:00 PM
To: htcondor-users@xxxxxxxxxxx; John M Knoeller <johnkn@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api

Hi John, hope you had a chance to look at this.

On Wed, Feb 28, 2018 at 1:26 PM Biruk Mammo <birukw@xxxxxxxxxx> wrote:

Here is the full log line:

condor_history: getInheritedSocks from CONDOR_INHERIT is '60562 <10.2.0.9:18316> 1 17*3*15*1*8*51*myusername@**$CondorVersion:_8.7.6_Jan_04_2018_BuildID:_428319_$*0*<10.2.0.9:25777>*48*2*0*9CEBCCEB79FAB9851039EDEAF169AC16C98AC4C827A7CA5A*0* 0 0'

On Wed, Feb 28, 2018 at 8:58 AM John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

Could you please send me the [REDACTED] bit from this ToolLog message?

condor_history: getInheritedSocks from CONDOR_INHERIT is ... [REDACTED]

The error indicates that the actual content of that string is incorrectly formatted.

thanks

-tj

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Biruk Mammo via HTCondor-users
Sent: Tuesday, February 27, 2018 5:11 PM
To: htcondor-users@xxxxxxxxxxx
Cc: Biruk Mammo <birukw@xxxxxxxxxx>
Subject: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api

Hello HTCondor users,

I get a "Failed to receive remote ad" error when using the Python bindings to query history immediately after submitting a job. Looking into the HTCondor logs, I see the following error in ToolLog:

[Timestamp] condor_history: getInheritedSocks from CONDOR_INHERIT is ... [REDACTED]

[Timestamp] ERROR "Assertion ERROR on (*ptmp == '*')" at line 2244 in file /slots/10/dir_3701941/userdir/.tmplMkQ9O/BUILD/condor-8.7.6/src/condor_io/sock.cpp

I also see a core dump in the log directory.

This error does not occur if I wait a few seconds before invoking schedd.history. Also, there is no error if I run the history query without submitting a job.

Below is the Python code that triggers the problem.

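import htcondor

# Describe a job that sleeps for 300 seconds.
submit = htcondor.Submit({'executable': '/usr/bin/sleep', 'arguments': '300'})
schedd = htcondor.Schedd()

# Queue the job inside a schedd transaction and print the result of queue().
with schedd.transaction() as txn:
    print(submit.queue(txn))

# Query the history immediately after submitting.
print(list(schedd.history('true', ['ClusterId'], 10)))
# RuntimeError: Failed to receive remote ad.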

Is there something I am missing? Thanks in advance for your help!

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/