
Re: [Condor-users] Multiple submit hosts



Thanks for the response.

I don't have a shared filesystem. Is it possible to have multiple
instances of quill running? I already have quill running on my central
manager.
Basically, if my central manager host gets too busy or crashes, we still
want to be able to view the jobs from other hosts.

I set up a collector on my other host, but when I run condor_q I see no
jobs there. However, on my primary CM I see the correct results.


On Tue, Jan 5, 2010 at 9:19 AM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
> Mag Gam wrote:
>>
>> I currently have 1 submit host which has the central manager, schedd,
>> collector, and negotiator.
>>
>> I would like to have another submit host which I can view the queue
>> (condor_q), and the status of all boxes (condor_status). If the first
>> server is down, I would still like to monitor the queue with this
>> server. I don't really care about scheduling and submitting more jobs.
>>  Any idea how to do this?
>>
>> I have been looking thru this:
>> http://www.cs.wisc.edu/condor/manual/v6.8/3_10High_Availability.html
>> but it seems like a bit overkill.
>
> You can do condor_status from anywhere, so the only issue is viewing the
> queue on machine X when machine X is dead.
>
> The schedd High Availability mechanism you reference above would certainly
> work.
>
> Considering all you want to do is view the queue on machine X when machine X
> is dead, another idea would be to install Quill -
>   http://www.cs.wisc.edu/condor/manual/v7.4/3_12Quill.html
> The idea here is you'd run PostgreSQL (open source database) on a different
> machine, and configure your schedd on machine X to "echo" all queue
> information into the database.  condor_q can then query either the  schedd
> or the database.  If you already have a reliable shared file system
> available, however, simply following the High Availability section above to
> have schedd failover may be less hassle than setting up PostgreSQL.
>
> Another primitive but simple idea: a script or batch file to periodically
> save the output of condor_q to a file on a shared file system or a web page.
>  You could submit this script as a local universe job to your schedd. :)
>
> Todd
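
The "primitive but simple" idea above (periodically saving the output of
condor_q to a file) could look roughly like the sketch below. It is only an
illustration under stated assumptions: the function names, the output path,
and the 60-second interval are hypothetical and do not come from the thread.
It runs plain condor_q with no arguments and writes the result through a
temporary file so that a reader on another host never sees a half-written
snapshot.

```python
import os
import subprocess
import tempfile
import time


def save_queue_snapshot(out_path, cmd=("condor_q",)):
    """Run cmd once and atomically write its output to out_path.

    Returns the exit code of the command. The helper name and the
    default command tuple are illustrative assumptions.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=False)
    if result.returncode == 0:
        text = result.stdout
    else:
        # Record the failure so a stale listing is not mistaken for a fresh one.
        text = "command failed with exit code %d\n%s" % (result.returncode, result.stderr)

    # Write to a temporary file in the same directory, then rename it into
    # place, so readers see either the old snapshot or the new one.
    directory = os.path.dirname(os.path.abspath(out_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".condor_q.")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file owner-only by default
    os.replace(tmp_path, out_path)
    return result.returncode


def run_forever(out_path, interval_seconds=60):
    """Refresh the snapshot every interval_seconds until interrupted."""
    while True:
        save_queue_snapshot(out_path)
        time.sleep(interval_seconds)
```

Calling run_forever("/some/shared/dir/condor_q.txt") from a small wrapper
script would keep the file current; the path is a placeholder for whatever
shared directory or web root is actually available. The snapshot only helps
when machine X is down if that file lives somewhere other than machine X.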