
Re: [HTCondor-users] HTCondor Cluster within Slurm Job



Hi,

At KIT (WLCG Tier 1 and university Tier 3), we developed COBalD/TARDIS [0] to integrate resources from various providers (HPC centres, clouds, etc.) into an overlay HTCondor pool [1]. We support a variety of backends. Probably most important for you: we have been using the Slurm backend at scale (up to 12500 cores) in production for a while now.
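
For the first step in your list (working out which nodes Slurm has allocated), a minimal Python sketch might look like the following. It is not part of COBalD/TARDIS; the function name expand_nodelist is made up for illustration, and it assumes at most one bracket group per host expression. In practice, `scontrol show hostnames "$SLURM_JOB_NODELIST"` should give the same expansion without any custom code.

```python
import re


def expand_nodelist(nodelist: str) -> list[str]:
    """Expand a compact Slurm node list such as 'n[01-03,07],gpu1'.

    Hypothetical helper for illustration only; handles at most one
    bracket group per host expression.
    """
    hosts = []
    # Split on commas that are not inside square brackets.
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?[^,\[]*", nodelist):
        match = re.fullmatch(r"(.*?)\[([^\]]*)\](.*)", part)
        if match is None:
            # Plain host name without a range, e.g. 'gpu1'.
            hosts.append(part)
            continue
        prefix, ranges, suffix = match.groups()
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            high = high or low  # a single index such as '07'
            width = len(low)    # preserve zero padding, e.g. 01 -> 02
            for number in range(int(low), int(high) + 1):
                hosts.append(f"{prefix}{number:0{width}d}{suffix}")
    return hosts
```

The resulting host list could then be used to decide where to start the HTCondor daemons for the duration of the Slurm job.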

If you have any questions, just let me know.

Best regards,
Manuel

[0]https://cobald-tardis.readthedocs.io/en/latest/
[1]https://www.epj-conferences.org/articles/epjconf/abs/2020/21/epjconf_chep2020_07038/epjconf_chep2020_07038.html

Dr. Manuel Giffels, Karlsruhe Institute of Technology (KIT), Steinbuch Centre for Computing (SCC)
Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen
Phone: +49 721 608 28636, Email: Manuel.Giffels@xxxxxxx

> On 29.09.2021 at 21:49, Leslie Hart - NOAA Federal via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
> 
> Hi,
> 
> Is it possible (and is there an existing recipe) to start up a "private" HTCondor cluster within a Slurm job? We have users who would like to allocate a number of nodes and then use those nodes as an HTCondor cluster for the duration of the job. Ideally, we could supply a few commands that they would use at the beginning and end of their Slurm batch job to start and shut down the cluster (the middle would consist of a series of HTCondor jobs, of course), e.g. HTCondorStart (would figure out the nodes that Slurm has allocated and create the cluster), HTCondorWait (would wait until all HTCondor jobs complete), and HTCondorFinish (would gracefully shut down HTCondor).
> 
> Thanks,
> Leslie Hart
