
Re: [HTCondor-users] Special configuration for Parallel Universe

Thanks Jaime.


That’s what I figured was the case, but I wanted to double check.

Just for context, the question came out of a discussion I had with a group of Slurm admins who all build their scheduler from source. They do this because the necessary library links (PMIx, CUDA, Slingshot, etc.) can only be made when the scheduler is compiled. I don't understand why that is necessary, but that's a question for the Slurm mailing list.

Trying to hold these scheduler frameworks in my head at once is a difficult mental juggling act.





From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Jaime Frey
Sent: 10 May 2022 05:15 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Special configuration for Parallel Universe




There’s nothing more you need to do when compiling or configuring HTCondor. Your users’ submit files need to specify a wrapper script that invokes the MPI launcher (e.g. mpirun) with the appropriate arguments. We provide sample wrapper scripts for common MPI implementations. This is described in the User section of the manual.
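For what it's worth, a parallel-universe submit file along the lines Jaime describes might look something like the sketch below. The file names and node count here are illustrative only (openmpiscript is one of the sample wrappers shipped with HTCondor; my_mpi_program stands in for the user's actual MPI binary):

```
# Hypothetical submit description file for a parallel-universe MPI job
universe       = parallel
executable     = openmpiscript      # sample Open MPI wrapper from HTCondor's examples
arguments      = my_mpi_program     # the MPI program the wrapper will launch
machine_count  = 4                  # number of slots the job runs across

should_transfer_files   = yes
when_to_transfer_output = on_exit
output = out.$(NODE)                # per-node stdout
error  = err.$(NODE)                # per-node stderr
log    = mpi_job.log

queue
```

The key point is that the MPI launch happens inside the wrapper script at run time, which is why no MPI-specific compile flags are needed when building HTCondor itself.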


 - Jaime

On May 10, 2022, at 5:36 AM, West, Matthew <M.T.West@xxxxxxxxxxxx> wrote:


Good morning build/make experts,


Beyond the details listed in the Admin manual for setting up a dedicated scheduler, are there additional measures one needs to take to allow multi-node jobs to run on that specific partition? I don’t have to build HTCondor from source and specify particular MPI/MPICH compiler flags in order for things to work, right?



Matthew West 
Research Software Engineer 
IDSAI, University of Exeter