
Re: [Condor-users] Condor Quill Problem - database not reachable



Does anyone have any idea at all about what's going on? Condor Quill keeps crashing, and most of the time it's unreachable.
Comments? Advice treasured!



On 22/02/2011 10:27, Santanu Das wrote:
I should mention that, in the MasterLog on vserv03, there are a number of these:

02/22 10:10:13 Started DaemonCore process "/opt/condor/sbin/condor_dbmsd", pid and pgroup = 13940
02/22 10:10:13 The DBMSD (pid 13940) died due to signal 11 (Segmentation fault)
02/22 10:10:13 restarting /opt/condor/sbin/condor_dbmsd in 3600 seconds

and the DbmsdLog reports these:

   02/22 10:10:13 main_init() called
   02/22 10:10:13 Using Database Type = Postgres
   02/22 10:10:13 Using Database IpAddress = vserv03:5432
   02/22 10:10:13 Using Database Name = quill_vserv03
   02/22 10:10:13 Using Database User = quillwriter
   02/22 10:10:13 Connection to database 'quill_vserv03' failed.
   02/22 10:10:13 FATAL:  connection limit exceeded for non-superusers
   02/22 10:10:13 Deallocating connection resources to database 'quill_vserv03'
   02/22 10:10:13 config: unable to connect to DB--- ERROR
   02/22 10:10:13 ERROR "config: unable to connect to DB
   " at line 133 in file ManagedDatabase.cpp
   Stack dump for process 13940 at timestamp 1298369413 (14 frames)
   condor_dbmsd(dprintf_dump_stack+0xb7)[0x5183f0]
   condor_dbmsd(_Z18linux_sig_coredumpi+0x2c)[0x50afc8]
   /lib64/libpthread.so.0[0x2b1dcd4abb10]
   condor_dbmsd(_ZN11DBMSManagerD1Ev+0xbd)[0x4dce1d]
   condor_dbmsd[0x4dbf16]
   /lib64/libc.so.6(exit+0xe5)[0x2b1dce4cf3a5]
   condor_dbmsd(__wrap_exit+0x28)[0x4f3330]
   condor_dbmsd[0x516911]
   condor_dbmsd(_ZN15ManagedDatabaseC1Ev+0x421)[0x4ddf35]
   condor_dbmsd(_ZN11DBMSManager4initEv+0x63)[0x4dc925]
   condor_dbmsd(_Z9main_initiPPc+0x2d)[0x4dbfe7]
   condor_dbmsd(main+0x18df)[0x50d26b]
   /lib64/libc.so.6(__libc_start_main+0xf4)[0x2b1dce4b9994]
   condor_dbmsd(__gxx_personality_v0+0x411)[0x4dbde9]
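
For anyone comparing notes on the same symptom, here is a minimal sketch of one way to tally the failure messages in a log like the DbmsdLog excerpt above. It is only an illustration: the function names `split_entries` and `summarize_failures` and the `MARKERS` list are hypothetical and not part of Condor, and the timestamp format is assumed to be `MM/DD HH:MM:SS` as in the excerpt. It also copes with several entries fused onto one line.

```python
import re
from collections import Counter

# Assumed timestamp prefix, as seen in the excerpt: "02/22 10:10:13 "
TIMESTAMP = re.compile(r"\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")

# Hypothetical substrings that mark an entry as a failure; adjust as needed.
MARKERS = ("FATAL:", "ERROR", "failed", "died due to signal")


def split_entries(text):
    """Split raw log text into entries, one per timestamp.

    Entries fused onto a single line are separated; lines without a
    timestamp (e.g. stack frames) stay attached to the preceding entry.
    """
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    entries = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        entries.append(text[start:end].strip())
    return entries


def summarize_failures(text):
    """Count how often each failure message occurs, ignoring timestamps."""
    counts = Counter()
    for entry in split_entries(text):
        message = TIMESTAMP.sub("", entry, count=1)
        if any(marker in message for marker in MARKERS):
            counts[message] += 1
    return counts
```

Calling `summarize_failures(open(path).read())` on a log file (where `path` is wherever your own DbmsdLog lives) returns a `Counter`, so `.most_common()` lists the most frequent failure first.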


Cheers,
Santanu

Santanu Das wrote:
Dear all,

Every time I try to use condor_history, I get this:

-- Quill: quill@xxxxxxxxxxxxxxxxxxxxxxxx : <vserv03:5432> : quill_vserv03
   -- Database at <vserv03:5432> not reachable
--Failing over to the history file at /home/condorr/spool/history instead --


And condor_q returns this:

-- Failed to fetch ads from db [quill_vserv03] at database server <vserv03:5432>
   -- Database not reachable or down.
           - Failing over to the quill daemon --

On the box where the Quill database is running (vserv03), I see these in the log:

02/22 09:48:36 *** Warning: Bad Log file; skipping malformed Attr List
   02/22 09:48:36 >>>>>>>> Fail: Polling Event Log <<<<<<<<
   02/22 09:48:36 ******** Start of Polling XML Log ********
   02/22 09:48:36 ********* End of Polling XML Log *********
   02/22 09:48:36 ++++++++ Sending Quill ad to collector ++++++++
   02/22 09:48:36 ++++++++ Sent Quill ad to collector ++++++++
   02/22 09:48:36 ******** Start of Polling Job Queue Log ********
   02/22 09:48:36 JOB QUEUE POLLING RESULT: NO CHANGE
   02/22 09:48:36 ********* End of Polling Job Queue Log *********
   02/22 09:48:36 ******** Start of Polling Event Log ********
02/22 09:48:55 failed to create classad; bad expr = username = "group_camont.camoNEW Rejects

Any idea what's going wrong, or where I should start digging?

Cheers,
Santanu



_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
