
Re: [HTCondor-users] Efficiency & centralization of global information gathering?



Thanks for the recommendation!

Is it feasible to have something just condor_advertise directly into the startd ads of each machine, or is there some permissions or scalability caveat I'm missing there? My primary concern is the impact on the source of the data, not so much collector overhead.  It seems like if you'd be publishing custom classads then it wouldn't be a far leap, but I suspect I'm probably missing something given how little I know about condor_advertise thus far.

I assume that if you published the data in question into a generic ad, you'd use condor_status instead of "cat" to pull it for the startd_cron, correct?

	-Michael Pelletier.

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Fischer, Max (SCC)
Sent: Wednesday, January 04, 2017 1:59 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Efficiency & centralization of global information gathering?

Hi Michael,

I've found this to be best solved outside of Condor.

1. Have a regular cron job *somewhere* fetch the data once.
2. Provide that data via files on shared filesystems.
3. Have startd_cron read from the file.
4. ???
5. Profit

The trick is just to keep 1. and 3. separate. There's no problem having 1. already create a proper ClassAd, with 3. just using cat.
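
As a rough sketch of step 1, assuming Python is available on the host running the central cron job: the script below measures free space once and writes it as "Name = value" lines, which is the form a startd_cron job can pass through with cat. The attribute names (OutputFsFreeMB, OutputFsPath) and both paths are made up for illustration and are not from any real setup. The file is written to a temporary name and renamed into place so that a startd_cron job reading at the same moment should never see a half-written file.

```python
import os
import shutil
import tempfile


def format_classad(attrs):
    """Render a dict as 'Name = value' lines (strings quoted, numbers bare)."""
    lines = []
    for name, value in attrs.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        lines.append(f"{name} = {rendered}")
    return "\n".join(lines) + "\n"


def publish(attrs, target_path):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".classad.")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(format_classad(attrs))
        os.chmod(tmp_path, 0o644)          # mkstemp creates the file as 0600
        os.replace(tmp_path, target_path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def free_megabytes(path):
    """Free space on the filesystem holding 'path', in whole MiB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical locations: adjust to the real output filesystem and
    # to a directory every execute node can read.
    output_fs = "/nfs/output"
    shared_file = "/shared/condor/pool_info.ad"
    publish(
        {"OutputFsFreeMB": free_megabytes(output_fs), "OutputFsPath": output_fs},
        shared_file,
    )
```

Step 3 is then nothing more than a startd_cron job whose executable cats that shared file, so each machine does a cheap file read rather than its own query against the data source.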

Note that using files for 2. is historical laziness on my part:
You can just as well publish this information via custom ClassAds. I think condor_advertise with UPDATE_AD_GENERIC should do the trick.
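
A minimal sketch of what that could look like, untested against a real pool: it builds the text of a generic ad and hands it to condor_advertise with the UPDATE_AD_GENERIC command. The ad name "pool_shared_info" and the attribute OutputFsFreeMB are hypothetical, and the assumption that a generic ad needs MyType and Name attributes should be checked against the manual for your HTCondor version. The machine running this also needs permission to advertise to the collector.

```python
import subprocess
import tempfile


def build_generic_ad(name, attrs):
    """Return ClassAd text for a generic ad; 'attrs' values must be numeric."""
    # Assumption: the collector wants MyType and Name on a generic ad.
    lines = ['MyType = "Generic"', f'Name = "{name}"']
    for key, value in attrs.items():
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


def advertise(ad_text, command="UPDATE_AD_GENERIC"):
    """Write the ad to a temp file and pass it to condor_advertise."""
    with tempfile.NamedTemporaryFile("w", suffix=".ad") as handle:
        handle.write(ad_text)
        handle.flush()
        # Raises CalledProcessError if condor_advertise exits non-zero.
        return subprocess.run(
            ["condor_advertise", command, handle.name], check=True
        )


if __name__ == "__main__":
    ad = build_generic_ad("pool_shared_info", {"OutputFsFreeMB": 1234})
    advertise(ad)
```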

Cheers,
Max

> On 04.01.2017 at 18:48, Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx> wrote:
> 
> Hi folks,
>  
> I'm wondering if there's some sort of trick I can use to provide machine attributes via some other mechanism than the startd. For example, one of the constraints on certain jobs might be the amount of disk space available on the output NFS filesystem; that is, I don't want to start a job unless there's at least a certain amount of disk space available on that filesystem. Similarly, monitoring a FlexLM license server to maintain an attribute for license counts would be useful if it's not feasible to dedicate a fixed set of licenses to a concurrency limit.
>  
> The obvious way to do this is a startd_cron job to check the available space on that filesystem, but the trouble is that this doesn't scale well: each machine winds up making its own query for a value which will be identical across all machines in the pool at any given time. While this is not a particular concern for a value such as this which is relatively infrequently queried and changes slowly, the scaling problem gets larger when you want to have a smaller query interval for a more dynamic value.
>  
> The alternative would be a schedd_cron job, since that would only run on the scheduler, but then the question is how to get the attributes it generates into a position where they can be evaluated for matching. Perhaps doing something with condor_advertise in a startd_cron job would be the right approach? Or perhaps there's something in the Python bindings that could handle this from a central point?
>  
> Thanks for any suggestions.
>  
> Michael V. Pelletier
> Principal Engineer
> Information Technology
> Program Support & Delivery
> Integrated Defense Systems
> Raytheon Company
> 
> +1 978-858-9681   (office)
> +1 339-293-9149   (cell)
> 7-225-9681   (tie line)
> Michael.V.Pelletier@xxxxxxxxxxxx
> 
> 50 Apple Hill Drive
> Tewksbury, MA 01876 USA
> www.raytheon.com
> 
> 
> This message contains information that may be confidential and privileged. Unless you are the addressee (or authorized to receive mail for the addressee), you should not use, copy or disclose to anyone this message or any information contained in this message. If you have received this message in error, please so advise the sender by reply e-mail and delete this message. Thank you for your cooperation.
>  
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/