That’s what I figured was the case, but I wanted to double check.
Just for context, the question was based on a discussion I had with a bunch of Slurm admins who all build their scheduler from source. They do this because the necessary library links (PMIx, CUDA, Slingshot, etc.) can only be made at scheduler compile time. I don’t get why that’s necessary, but that’s a question for the Slurm mailing list.
Trying to hold these scheduler frameworks in my head at once is a difficult mental juggling act.
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx>
On Behalf Of Jaime Frey
Sent: 10 May 2022 05:15 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Special configuration for Parallel Universe
There’s nothing more you need to do when compiling or configuring HTCondor. Your users’ submit files need to specify a wrapper script that invokes the MPI launcher (e.g. mpirun) with the appropriate arguments. We provide some sample wrapper scripts for common MPI implementations. This is described in the User section of the manual:
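In case it helps the archive, a parallel-universe submit file along those lines looks roughly like the sketch below. It uses `openmpiscript` (one of the sample wrapper scripts HTCondor ships for Open MPI); the program name, node count, and file names are illustrative:

```
# Minimal parallel-universe sketch; my_mpi_program and machine_count
# are placeholders. openmpiscript is HTCondor's sample Open MPI wrapper.
universe              = parallel
executable            = openmpiscript
arguments             = my_mpi_program
machine_count         = 4
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files  = my_mpi_program
output                = mpi_$(NODE).out
error                 = mpi_$(NODE).err
log                   = mpi.log
queue
```

The wrapper script, not HTCondor itself, is what sets up the MPI environment and calls mpirun across the allocated slots, which is why no special compile-time flags are needed.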
Good morning build/make experts,
Beyond the details listed in the Admin manual for setting up a dedicated scheduler, are there additional measures one needs to take to allow multi-node jobs to run on that specific partition? I don’t have to build HTCondor from source with particular MPI/MPICH compiler flags for things to work, right?
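(For anyone finding this thread later: the dedicated-scheduler setup referenced above boils down to execute-node configuration along these lines. The hostname is a placeholder, and this is only a sketch of the Admin manual recipe, not a complete config.)

```
# On each execute node that should run parallel-universe jobs:
DedicatedScheduler = "DedicatedScheduler@submit.example.com"
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

# Prefer jobs from the dedicated scheduler on these slots.
RANK = Scheduler =?= $(DedicatedScheduler)
```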
Research Software Engineer
IDSAI, University of Exeter