
Re: [HTCondor-users] K8s usage in the HTCondor community



Hi Matt, Adam et al,

It's been a while since I had contact with the cloud community to see how things have moved on, so take my input with a pinch of skepticism.

The three big differences between clouds and clusters that influence what kind of questions you'd be asking your scheduler:

1. Clouds = long instantiation times ; clusters = "jobs" that have some finite run time after which they are "done"
2. Clouds = over-provisioned ; clusters = completely full
3. Clouds = jobs that are often doing nothing (web servers) ; clusters = jobs using 100% CPU

Of course these are generalisations, so please refrain from educating me about exceptions and edge cases; if one of the statements above is wrong at the 70% level, I certainly would like to hear about it.

My colleague Oxana Smirnova once said during a talk, "Scheduling is only interesting when the system is full" - a brilliant one-liner.  Whenever point 2 above holds, you immediately have a huge reason why cloud "scheduling" is way different from cluster scheduling.

HTH

JT


> On 5 Nov 2023, at 09:13, Bockelman, Brian <BBockelman@xxxxxxxxxxxxx> wrote:
> 
> Hello Matt, Adam,
> 
> We definitely use Kubernetes these days!  For the PATh project (https://path-cc.io/), nearly all of our central services live inside Kubernetes.
> 
> A few use cases I've seen that mix Kubernetes and HTCondor:
> 1.  Running the HTCondor central manager inside Kubernetes.  It's a simple, relatively static service - perhaps no interesting items there.
>   - You asked about stateless: we often forget, but there is state in the central manager!  It's just fairly minimal.
> 2.  Running pods as backfill.  Put an HTCondor EP (execution point, aka worker node) inside a container and run it as a pod as part of a larger deployment.  When there are higher-priority pods to execute, the HTCondor EP is preempted by Kubernetes.  Again, a pretty simple scheduling case (a sketch of this pattern follows the list).
> 3.  Auto-scaling HTCondor EPs when there is work to be done (see https://github.com/opensciencegrid/htcondor-autoscale-manager).  This is done on the "PATh Facility" so the hosts can be used when otherwise idle.  A Prometheus metric determines how many additional pods are needed, allowing the HPA to do its job.
>   - This relies on the HTCondor "rooster" mechanism, where the negotiator can annotate a ClassAd representing an offline slot as having matching jobs.  This is taken into account in a Prometheus metric, triggering the HPA scale-up (see the second sketch after the list).
>   - Feedback: the scale-down mechanism of the HPA leaves quite a bit to be desired.  The EP knows when it is idle, making it a great target for scale-down, or even for scaling itself down preemptively.  We solve this in the htcondor-autoscale-manager by annotating the pod with a preempt priority; however, it feels quite brittle to me.
> 4.  The NRP team has a really cool project where they submit HTCondor EPs as Kubernetes jobs.  When they're idle, the jobs finish, solving the scale-down issue nicely (though there's more work in doing the scale-up!).
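> 
> For the backfill pattern in item 2, here is a minimal sketch of the submission, assuming a pre-created low-value PriorityClass named "backfill"; the names, image tag, and resource numbers are illustrative, not our actual deployment:
> 
>     # Sketch: launch an HTCondor EP as a low-priority backfill pod.
>     # Assumes a PriorityClass named "backfill" with a low value already
>     # exists; image tag and resource requests are illustrative.
>     from kubernetes import client, config
> 
>     config.load_kube_config()  # use load_incluster_config() in-cluster
> 
>     pod = client.V1Pod(
>         metadata=client.V1ObjectMeta(name="htcondor-ep-backfill"),
>         spec=client.V1PodSpec(
>             # Low priority: the scheduler evicts this pod first when
>             # higher-priority pods need the node.
>             priority_class_name="backfill",
>             restart_policy="Never",
>             containers=[
>                 client.V1Container(
>                     name="execute",
>                     image="htcondor/execute:latest",
>                     resources=client.V1ResourceRequirements(
>                         requests={"cpu": "4", "memory": "8Gi"},
>                     ),
>                 )
>             ],
>         ),
>     )
>     client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)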
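> 
> And for the autoscaling in item 3, the metric half has roughly this shape; the metric name, port, and demand query are invented for illustration and are not the htcondor-autoscale-manager's actual internals:
> 
>     # Sketch: publish a demand gauge that the HPA consumes through a
>     # Prometheus adapter.  Names and numbers are illustrative.
>     import math
>     import time
>     from prometheus_client import Gauge, start_http_server
> 
>     eps_wanted = Gauge("htcondor_eps_wanted",
>                        "EPs needed to run currently matched jobs")
> 
>     def query_matched_offline_slots() -> int:
>         """Hypothetical stand-in: count offline slot ads the negotiator
>         has annotated (the "rooster" mechanism) as having matching jobs."""
>         return 0
> 
>     def desired_replicas(current: int, metric: float, target: float) -> int:
>         # The HPA applies this rule on its side; shown here for clarity:
>         # desired = ceil(current * metric / target).
>         return math.ceil(current * metric / target)
> 
>     start_http_server(8000)  # endpoint for Prometheus to scrape
>     while True:
>         eps_wanted.set(query_matched_offline_slots())
>         time.sleep(30)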
> 
> For scheduling in general, I think an interesting difference is the focus on multi-tenant scheduling in the face of scarcity; for example, if the cluster is fixed-size and always oversubscribed, how do you make resource allocation decisions?
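> 
> To make that concrete, here is a toy fair-share allocator in the same spirit as, though far simpler than, HTCondor's user-priority negotiation: each tenant's share of a full, fixed-size pool shrinks as its recent usage grows.  Every name and number is invented:
> 
>     # Toy multi-tenant allocator for a fixed-size, oversubscribed pool.
>     # A simplification in the spirit of fair share, not HTCondor's
>     # actual negotiation algorithm.
>     def allocate(total_slots: int, demand: dict[str, int],
>                  recent_usage: dict[str, float]) -> dict[str, int]:
>         """Split slots among tenants, weighting against recent usage."""
>         # Weight inversely to past consumption (+1 avoids division by zero).
>         weights = {u: 1.0 / (1.0 + recent_usage.get(u, 0.0)) for u in demand}
>         total_weight = sum(weights.values())
>         grants = {}
>         for user, want in demand.items():
>             fair = int(total_slots * weights[user] / total_weight)
>             grants[user] = min(want, fair)  # never grant more than requested
>         return grants
> 
>     if __name__ == "__main__":
>         # Three tenants together want more than the 100-slot pool holds;
>         # the heaviest recent user gets the smallest grant.
>         print(allocate(100,
>                        demand={"alice": 80, "bob": 60, "carol": 50},
>                        recent_usage={"alice": 500.0, "bob": 100.0,
>                                      "carol": 10.0}))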
> 
> Hope this helps,
> 
> Brian
> 
> PS -- I don't think of there as being friction between the "cloud" and "batch" views of scheduling, but rather a wonderful diversity of approaches and design priorities!
> 
>> On Nov 3, 2023, at 4:51 PM, Matthew T West via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
>> 
>> Good Friday afternoon,
>> 
>> Because I like introducing CNCF folks to this community: Adam McArthur works on G-Research's OSS team <https://opensource.gresearch.com/>. He is trying to understand how projects that use HTCondor, amongst other traditional batch schedulers, leverage Kubernetes to deploy containers/pods for either compute hosts or services.
>> 
>> Of particular interest is whether k8s is still being used in its traditional stateless manner, and if not, why?
>> 
>> From my interactions with folks in the CNCF Batch (compute) Working Group, there seems to be some friction between how cloudy folks envision "scheduling" and how we view it. Each side seems skeptical of the other's design philosophy, and there is a bit of cross-talk going on.
>> 
>> IIRC, the PATh Facility uses Kubernetes to manage/deploy its local compute resources, correct? If anyone else uses k8s for container deployment of HTCondor daemons or for other production services, we'd love to hear more about it.
>> 
>> Cheers,
>> Matt
>> 
>> P.S. - Any faults in the descriptions of either k8s or htcondor deployments are purely my own.
>> 
>> -- 
>> Matthew T. West
>> DevOps & HPC SysAdmin
>> University of Exeter, Research IT
>> www.exeter.ac.uk/research/researchcomputing/support/researchit
>> 57 Laver Building, North Park Road, Exeter, EX4 4QE, United Kingdom
>> 
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>> 
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/