
Re: [Condor-users] Condor Quill Problem



I think you only need one DBMSD running per pool. Try removing it from DAEMON_LIST on serv07 and restarting the Condor services.
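
As a rough sketch of that change, assuming DAEMON_LIST on serv07 is set in that host's local configuration file (the file location is an assumption, and vserv03 is left unchanged so the pool keeps a single DBMSD):

```
# serv07 (submit host): DBMSD dropped, everything else as before.
# Sketch only; adjust to wherever DAEMON_LIST is actually defined.
DAEMON_LIST = MASTER, SCHEDD, QUILL
```

After editing, restart Condor on serv07 (for example with condor_restart) and try condor_q -direct quilld again.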

 

Thanks,

Wancheng  

 

 


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Santanu Das
Sent: Friday, January 07, 2011 8:52 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor Quill Problem

 

Hi Guys,

Sorry for bringing up this old post again, but unfortunately I'm seeing the same problem.

The Central Manager and the Submit Host are two different machines (vserv03 and serv07, respectively), and I'm running the Quill daemon on both.

Central Manager (vserv03):
DAEMON_LIST                     = MASTER, COLLECTOR, NEGOTIATOR, DBMSD, QUIL
Submit Host (serv07):
DAEMON_LIST                     = MASTER, SCHEDD, DBMSD, QUILL


The Postgres database is running on the Central Manager (vserv03), with the necessary permissions for the Submit Host to access it. I can run condor_q (or condor_q -direct schedd) without any problem, but "quilld" returns exactly the same error:

[root@serv07 ~]# condor_q -direct quilld
 
-- Failed to fetch ads from: <172.24.116.185:9915> : serv07.hep.phy.cam.ac.uk
 
-- Quill daemon at quill@xxxxxxxxxxxxxxxxxxxxxxxx(<172.24.116.185:9915>)
        associated with schedd serv07.hep.phy.cam.ac.uk(<172.24.116.185:9744>)
        is not reachable or can't talk to rdbms.
        - Not failing over due to -direct
 
 

I haven't been able to figure out what I'm doing wrong. Does anyone here have any information for me, please?
As Steve suggested, here is the output:

[root@serv07 ~]# condor_config_val -dump | grep QUILL
DAEMON_LIST = MASTER, SCHEDD, DBMSD, QUILL
QUILL = $(SBIN)/condor_quill
QUILL_ADDRESS_FILE = $(LOG)/.quill_address
QUILL_DB_IP_ADDR = vserv03:5432
QUILL_DB_NAME = quill_vserv03
QUILL_DB_QUERY_PASSWORD = reader
QUILL_DB_TYPE = PGSQL
QUILL_DB_USER = quillwriter
QUILL_DBSIZE_LIMIT = 20
QUILL_ENABLED = TRUE
QUILL_HISTORY_DURATION = 30
QUILL_IS_REMOTELY_QUERYABLE = TRUE
QUILL_JOB_HISTORY_DURATION = 3650
QUILL_LOG = $(LOG)/QuillLog
QUILL_MAINTAIN_DB_CONN = TRUE
QUILL_MANAGE_VACUUM = FALSE
QUILL_NAME = quill@$(FULL_HOSTNAME)
QUILL_POLLING_PERIOD = 10
QUILL_RESOURCE_HISTORY_DURATION = 7
QUILL_RUN_HISTORY_DURATION = 7
QUILL_USE_SQL_LOG = TRUE
 

Thanks in advance; looking forward to hearing from someone.

Cheers,
Santanu



On 05/01/2009 22:11, Steven Timm wrote:

Send the output of
 
condor_config_val -dump | grep QUILL
 
Also, do you have your .pgpass file stored in the $(SPOOL) directory
on the machine that runs the schedd?
 
Is Postgres up and running? The error message indicates
that it probably is not, or else you can't talk to it because
the .pgpass file is missing.
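
A minimal sketch of those two checks follows. It assumes the file is named .pgpass and sits directly under the spool directory, and that entries use PostgreSQL's hostname:port:database:username:password layout. The helper name check_pgpass is made up for illustration and is not part of Condor.

```shell
# check_pgpass SPOOL_DIR
# Hypothetical helper: reports whether SPOOL_DIR/.pgpass exists, is
# non-empty, and has at least one line shaped like
# hostname:port:database:username:password (comment lines are skipped).
check_pgpass() {
    spool_dir="$1"
    pgpass="$spool_dir/.pgpass"

    # -s is true only if the file exists and is not empty.
    if [ ! -s "$pgpass" ]; then
        echo "missing or empty: $pgpass"
        return 1
    fi

    # Look for a non-comment line with five colon-separated fields.
    if grep -Eq '^[^:#]+:[^:]+:[^:]+:[^:]+:.+' "$pgpass"; then
        echo "ok: $pgpass"
        return 0
    fi

    echo "no usable entry in: $pgpass"
    return 1
}
```

On the schedd machine this could be called as check_pgpass "$(condor_config_val SPOOL)". To see whether the database itself answers, something like psql -h DBHOST -p PORT -U DBUSER -d DBNAME -c 'SELECT 1' from the same machine should do, with the upper-case names replaced by your own values.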
 
What happens with just plain condor_q?
 
Steve Timm
 
 
 
On Mon, 5 Jan 2009, Natarajan, Senthil wrote:
 
Hi,
I am trying to set up Quill (Condor 7.0.5, PostgreSQL 8.3.5), but
condor_q and condor_history information is not being saved in the database.
 
When I try
condor_q -direct quilld
 
-- Failed to fetch ads from: <xxx.xx.xxx.xx:36983> : xxxxxxxx
 
-- Quill daemon at quill@xxxxxx(<xxx.xx.xxx.xx:36983>)
       associated with schedd xxxxxx(<xxx.xx.xxx.xx:36981>)
       is not reachable or can't talk to rdbms.
       - Not failing over due to -direct
 
There is no error message in QuillLog or DbmsdLog.
 
Could you please let me know what the problem might be?
 
Thanks,
Senthil