
Re: [HTCondor-users] collector hierarchy burst



This is a testbed cluster that we use to stress-test HTCondor,
trying to find its limits. For this, we use a custom stress-testing
framework, and to interact with Condor we use the standard Condor
tools, ssh, and the Python API provided by Condor. If we decide to go
with Condor, the production cluster will foreseeably be even bigger
and made up of VMs, with 1 slot / machine as the most common use case,
which means ~100 000 machines.

2014-04-06 16:18 GMT+02:00 Keith Brown <keith6014@xxxxxxxxx>:
> impressive.
>
> what tools do you use to manage such a large setup?
>
>
>
> On Sun, Apr 6, 2014 at 10:13 AM, Pek Daniel <pekdaniel@xxxxxxxxx> wrote:
>>
>> Well, in our case it's more like ~3000 machines with ~100 000 job slots.
>>
>> 2014-04-06 16:06 GMT+02:00 Keith Brown <keith6014@xxxxxxxxx>:
>> > 50 subcollectors :-)
>> >
>> > I am running a pool of 600 machines with thousands of ClassAds updating
>> > every 10 secs all with one collector (7.4).
>> >
>> >
>> >
>> >
>> > On Wed, Mar 19, 2014 at 11:57 AM, Pek Daniel <pekdaniel@xxxxxxxxx> wrote:
>> >>
>> >> Thanks for the hint!
>> >>
>> >> By the way, as far as I know in case of condor_startd there's an
>> >> option for this purpose:
>> >>
>> >>
>> >> http://research.cs.wisc.edu/htcondor/manual/v8.1/3_3Configuration.html#param:UpdateOffset
>> >>
>> >> But I suppose currently there's no counterpart of this option for the
>> >> collectors.
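
For reference, the startd option mentioned above is usually combined with the $RANDOM_INTEGER() macro. A minimal sketch, assuming a 300-second update interval (the values are illustrative, not taken from this pool):

```
# Illustrative values only; tune the interval for your own pool.
# Each condor_startd sends its updates every 300 seconds ...
UPDATE_INTERVAL = 300
# ... but delays its first update by a random 0-300 seconds, so startds
# that come up together do not all report at the same instant.
UPDATE_OFFSET = $RANDOM_INTEGER(0, 300)
```

As far as I understand the manual, the offset applies only to the first update after startup or a reconfig; later updates follow at UPDATE_INTERVAL from that point.
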
>> >>
>> >> 2014-03-18 14:43 GMT+01:00 Brian Bockelman <bbockelm@xxxxxxxxxxx>:
>> >> > Hi Daniel,
>> >> >
>> >> > Probably not (but I didn't look at the code so I'm not sure).
>> >> >
>> >> > You can set the update interval for each to include a random element
>> >> > with the special $RANDOM_INTEGER() config macro.  See
>> >> >
>> >> > http://research.cs.wisc.edu/htcondor/manual/v8.1/3_3Configuration.html#SECTION00432000000000000000.
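
A minimal sketch of that suggestion, assuming COLLECTOR_UPDATE_INTERVAL (default 900 seconds) is the knob that controls the periodic updates in question. The range below is an illustrative choice and has not been checked against the burst described in this thread:

```
# Assumption: each collector process evaluates $RANDOM_INTEGER() on its own
# when it reads the configuration, so every subcollector ends up with a
# slightly different period between 840 and 960 seconds.
COLLECTOR_UPDATE_INTERVAL = $RANDOM_INTEGER(840, 960, 1)
```

Differing periods would make the update times drift apart over a few cycles; they would not by themselves stagger the very first update after a simultaneous start.
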
>> >> >
>> >> > I opened a ticket to investigate this issue:
>> >> > https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=4264
>> >> >
>> >> > Brian
>> >> >
>> >> > On Mar 14, 2014, at 5:23 AM, Pek Daniel <pekdaniel@xxxxxxxxx> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> I've set up a collector hierarchy on two machines with 50
>> >> >> subcollectors / machine, and a main collector. My question: if I
>> >> >> start up the collector daemons more or less at the same time
>> >> >> (DAEMON_LIST = $(DAEMON_LIST) COLLECTOR2 COLLECTOR3 ...), are the
>> >> >> subcollectors clever enough not to send their periodic ClassAd
>> >> >> updates to the main collector at the very same time, causing a
>> >> >> burst?
>> >> >>
>> >> >> Thanks,
>> >> >> Daniel
>> >> >> _______________________________________________
>> >> >> HTCondor-users mailing list
>> >> >> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
>> >> >> with a
>> >> >> subject: Unsubscribe
>> >> >> You can also unsubscribe by visiting
>> >> >> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>> >> >>
>> >> >> The archives can be found at:
>> >> >> https://lists.cs.wisc.edu/archive/htcondor-users/