
Re: [HTCondor-users] Remote condor_history with multiple Schedd hosts



By the way, this is clearly a bug. I think the schedd should be passing the local name down to condor_history so that condor_history ends up reading the configuration in the same way that the schedd does.

 

This will require changes to condor_history.
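
To illustrate the mismatch, here is a rough Python sketch of local-name config lookup. It is a simplified model, not HTCondor's actual code: the `lookup` helper and the `config` dict are hypothetical, and the values are copied from Collin's config below. The assumption is that a daemon started with -local-name TROLLS20 prefers TROLLS20.HISTORY over HISTORY, while a condor_history spawned without the local name only sees the plain HISTORY value.

```python
def lookup(config, name, local_name=None):
    """Return the value of `name`, preferring `<local_name>.<name>` if it is set.

    Hypothetical helper: a simplified model of local-name lookup, not HTCondor code.
    """
    if local_name is not None:
        prefixed = f"{local_name}.{name}"
        if prefixed in config:
            return config[prefixed]
    return config.get(name)


# Values taken from the configuration quoted in the original message.
config = {
    "SCHEDD_NAME": "gld-default@",
    "HISTORY": "/opt/condor/history/gld-default.history",
    "TROLLS20.SCHEDD_NAME": "trolls2-0@",
    "TROLLS20.HISTORY": "/opt/condor/history/trolls2-0.history",
}

# What the schedd started with "-local-name TROLLS20" resolves.
schedd_view = lookup(config, "HISTORY", local_name="TROLLS20")

# What condor_history resolves when no local name is passed down to it.
history_tool_view = lookup(config, "HISTORY")

print("schedd writes to:       ", schedd_view)
print("condor_history reads:   ", history_tool_view)
```

If that model is right, the schedd writes trolls2-0.history while the condor_history it invokes looks at gld-default.history, which on the secondary host presumably has no records. That would explain the header-only output.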

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Collin Mehring
Sent: Tuesday, December 17, 2019 2:51 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Remote condor_history with multiple Schedd hosts

 

Hello Experts,

 

I'm having an issue with condor_history (and the Python binding equivalent) returning blank results for some Schedds. (Version 8.8.5)

 

We have several Schedds in our pool split across two physical hosts. We consider one of these hosts the "primary" as it contains the default Schedd (i.e. DAEMON_LIST contains SCHEDD). We have specified the name and history file path for this Schedd:

 

SCHEDD_NAME = gld-default@

HISTORY = /opt/condor/history/gld-default.history

 

The additional schedds on both hosts follow a similar pattern, for example:

 

SCHEDD_TROLLS20 = $(SCHEDD)
SCHEDD_TROLLS20_ARGS = -f -local-name TROLLS20 -p 8510
TROLLS20.SCHEDD_NAME = trolls2-0@
TROLLS20.HISTORY = /opt/condor/history/trolls2-0.history

<...>

DAEMON_LIST = $(DAEMON_LIST), SCHEDD_TROLLS20

 

All config settings, other than the different schedds, are the same on both hosts.

 

Running 'condor_history -n gld-default@' from a remote host in the pool will return that Schedd's history correctly. Similarly, using -name for any Schedd on the primary host will work as expected. However, specifying the name of a Schedd on the secondary host will return just the header with no results. (e.g. condor_history -n trolls2-0@)

 

The command is reaching the correct Schedd, because it logs the following in response:

 

12/17/19 12:18:15 (pid:275016) invoking /usr/bin/condor_history condor_history -inherit -stream-results -match -1 -scanlimit 10000 -constraint true -attributes Args,Arguments,ClusterId,Cmd,CompletionDate,JobStatus,Owner,ProcId,QDate,RemoteUserCpu,RemoteWallClockTime
12/17/19 12:18:15 (pid:275016) Create_Process: using fast clone() to create child process.

 

The Schedd is also writing the history file correctly, because using condor_history -file instead works. (e.g. condor_history -file /opt/condor/history/trolls2-0.history)

 

Any help is appreciated.

 

Thanks,

Collin

--

Collin Mehring | PE-JoSE - Software Engineer
