Re: [HTCondor-users] Parallel universe as the only universe.



Hi Amy,

While not specifically about the parallel universe, Stefano Dal Pra gave a good talk <https://indico.cern.ch/event/1059494/contributions/4532532/> last year at the Autumn HTCondor week on managing mixed workloads (single-core and single-node multi-core) in the same system. There might be other materials lurking in the <https://research.cs.wisc.edu/htcondor/past_condor_weeks.html> archives.

On the specific case, I think he just needs to learn how to share. Compute power isn't infinite and other people are using the machines as well. Hopefully he is checkpointing his jobs so whenever they are pre-empted, the work isn't lost. 

What do you, as a UTexas sys-admin, want this particular machine's focus to be: single-core throughput, shared-memory HPC, something for everyone? Do you have any feedback from other users beyond this particularly demanding one?

Cheers,
Matt West

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Amy Bush
Sent: 29 March 2022 08:57 PM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Parallel universe as the only universe.


We have only recently dipped a toe into a parallel universe configuration, and only have a few machines dedicated to this universe.
Recently a new grad student entered the chat, and he would like All The Machines dedicated to parallel universe, because it's the universe he cares about, so now it is the most important. Even after adding more compute nodes to the parallel universe, he still isn't happy because some of his jobs are being preempted by other users with better priority.

He suggested that everything just be changed to parallel universe, and then people can request a single cpu if their job is not parallel.

I can't find any instances via google telling me anyone has done that, or why it's a good or bad idea.

Can anyone here opine?

Thanks!

--
amy