
Re: [Condor-users] Condor Quill Problem



Two points:

1. You should only run the DBMSD on one machine. The manual says: "One machine should run the condor_dbmsd daemon. On this machine, add it to the DAEMON_LIST configuration variable. All Quill-enabled machines should also run the condor_quill daemon. The machine running the condor_dbmsd daemon can also run a condor_quill daemon."

The manual is probably not clear enough on this point, and should say "Only one machine".
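As a rough illustration of that rule, here is a minimal Python sketch that checks how many hosts list DBMSD. The two DAEMON_LIST values are copied verbatim from the message quoted below; the helper name hosts_running is made up for this example and is not part of Condor.

```python
# Illustrative check only: these DAEMON_LIST values are copied as posted
# in this thread, and hosts_running() is a hypothetical helper.
daemon_lists = {
    "vserv03": "MASTER, COLLECTOR, NEGOTIATOR, DBMSD, QUIL",
    "serv07": "MASTER, SCHEDD, DBMSD, QUILL",
}


def hosts_running(daemon, lists):
    """Return the sorted hosts whose DAEMON_LIST contains `daemon`."""
    return sorted(
        host
        for host, value in lists.items()
        if daemon in [name.strip().upper() for name in value.split(",")]
    )


dbmsd_hosts = hosts_running("DBMSD", daemon_lists)
if len(dbmsd_hosts) != 1:
    print("DBMSD should run on exactly one machine, found:", dbmsd_hosts)
```

With the lists as posted, both hosts are reported, which is the situation point 1 describes. Dropping DBMSD from one of the two lists would make the check pass.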

2. condor_q -direct quilld isn't supported any more: the Quill daemon no longer answers queries directly. It is pretty easy to argue that it's a bug that the option is still accepted on the command line at all. (However, in unusual pool setups with a new condor_q and old (v6.8) condor_quill daemons, it would still work.)

(For people reading this thread from the future: it's possible that a future version of Condor will support querying the Quill daemon again, but as far as I know it's not on the Condor team's development roadmap.)

-Erik


On Fri, Jan 7, 2011 at 7:52 AM, Santanu Das <santanu@xxxxxxxxxxxxxxxxx> wrote:
Hi Guys,

Sorry for bringing in this old post again: unfortunately I'm seeing the same problem.

The Central Manager and the Submit Host are two different machines (vserv03 and serv07, respectively) and I'm running the Quill daemon on both.

Central Manager (vserv03):
DAEMON_LIST			= MASTER, COLLECTOR, NEGOTIATOR, DBMSD, QUIL
Submit Host (serv07):
DAEMON_LIST			= MASTER, SCHEDD, DBMSD, QUILL

The postgres database is running on the Central Manager (vserv03), with the necessary permissions for the Submit Host to access it. I can run condor_q (or condor_q -direct schedd) without any problem, but "quilld" returns exactly the same error:

[root@serv07 ~]# condor_q -direct quilld

-- Failed to fetch ads from: <172.24.116.185:9915> : serv07.hep.phy.cam.ac.uk

-- Quill daemon at quill@xxxxxxxxxxxxxxxxxxxxxxxx(<172.24.116.185:9915>)
	associated with schedd serv07.hep.phy.cam.ac.uk(<172.24.116.185:9744>)
	is not reachable or can't talk to rdbms.
	- Not failing over due to -direct


I haven't been able to figure out what I'm doing wrong. Does anyone here have some info for me, please?
As Steve suggested, here is the output:
[root@serv07 ~]# condor_config_val -dump | grep QUILL
DAEMON_LIST = MASTER, SCHEDD, DBMSD, QUILL
QUILL = $(SBIN)/condor_quill
QUILL_ADDRESS_FILE = $(LOG)/.quill_address
QUILL_DB_IP_ADDR = vserv03:5432
QUILL_DB_NAME = quill_vserv03
QUILL_DB_QUERY_PASSWORD = reader
QUILL_DB_TYPE = PGSQL
QUILL_DB_USER = quillwriter
QUILL_DBSIZE_LIMIT = 20
QUILL_ENABLED = TRUE
QUILL_HISTORY_DURATION = 30
QUILL_IS_REMOTELY_QUERYABLE = TRUE
QUILL_JOB_HISTORY_DURATION = 3650
QUILL_LOG = $(LOG)/QuillLog
QUILL_MAINTAIN_DB_CONN = TRUE
QUILL_MANAGE_VACUUM = FALSE
QUILL_NAME = quill@$(FULL_HOSTNAME)
QUILL_POLLING_PERIOD = 10
QUILL_RESOURCE_HISTORY_DURATION = 7
QUILL_RUN_HISTORY_DURATION = 7
QUILL_USE_SQL_LOG = TRUE

Thanks in advance; looking forward to hearing from someone.

Cheers,
Santanu



On 05/01/2009 22:11, Steven Timm wrote:
Send the output of

condor_config_val -dump | grep QUILL

Also, do you have your .pgpass file stored in the $(SPOOL) directory
on the machine that runs the schedd?

Is postgres up and running? The error message indicates
that it is probably not, or else you can't talk to it due
to the missing .pgpass file.
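
For reference, a minimal Python sketch of what one .pgpass entry could look like, assuming Quill expects the standard PostgreSQL password-file layout (host:port:database:username:password). The helper name pgpass_line is invented for this example, the host, port, database and user are taken from the condor_config_val output quoted earlier in this thread, and "<password>" is a placeholder.

```python
def pgpass_line(host, port, database, user, password):
    """Format one entry in PostgreSQL's password-file layout (hypothetical helper)."""
    fields = [host, str(port), database, user, password]
    # PostgreSQL requires ':' and '\' inside a field to be backslash-escaped.
    escaped = [f.replace("\\", "\\\\").replace(":", "\\:") for f in fields]
    return ":".join(escaped)


# Values below come from the QUILL_DB_* settings posted in this thread;
# the password is a placeholder, not a real credential.
print(pgpass_line("vserv03", 5432, "quill_vserv03", "quillwriter", "<password>"))
```

The resulting line would go into the .pgpass file in the $(SPOOL) directory mentioned above. Whether wildcards such as * are accepted in the port or database fields depends on how the daemon reads the file, so treat that as something to verify against the manual.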

What happens with just plain condor_q?

Steve Timm



On Mon, 5 Jan 2009, Natarajan, Senthil wrote:

Hi,
I am trying to set up Quill (Condor 7.0.5, PostgreSQL 8.3.5).
The condor_q and condor_history information is not being saved in the database.

When I try
condor_q -direct quilld

-- Failed to fetch ads from: <xxx.xx.xxx.xx:36983> : xxxxxxxx

-- Quill daemon at quill@xxxxxx(<xxx.xx.xxx.xx:36983>)
       associated with schedd xxxxxx(<xxx.xx.xxx.xx:36981>)
       is not reachable or can't talk to rdbms.
       - Not failing over due to -direct

There are no error messages in QuillLog or DbmsdLog.

Could you please let me know what might be the problem?

Thanks,
Senthil


    

