
Re: [HTCondor-users] Choosing suitable universes for jobs



Hi Dmitry - 

I'm slightly confused about the use case. Within condor you can enable cgroups to enforce memory limits, and there are very simple ways to do this using the vanilla universe (I would have to check which of the other universes are supported, but brianb would know).

Thus you can protect your jobs via their resource requirements (memory & cpus).
In addition, there is a plethora of admin recipes that one can apply: https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToAdminRecipes and, more specifically, https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=WholeMachineSlots.
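
As a rough illustration of stating memory and cpu needs per job, here is a minimal sketch that generates a vanilla universe submit description. The helper name, the jar and class names, the JVM path, and the resource numbers are all made up for the example; only the submit commands themselves (universe, request_cpus, request_memory, and so on) are standard ones. It assumes java is installed at the same path on the execute nodes.

```python
def build_submit_description(jar, main_class, cpus, memory_mb):
    """Build the text of a vanilla-universe submit description.

    jar, main_class: hypothetical application jar and entry point.
    cpus, memory_mb: resources the job asks for (request_memory is in MB).
    """
    lines = [
        "universe = vanilla",
        "executable = /usr/bin/java",    # assumed JVM path on execute nodes
        "transfer_executable = false",   # use the JVM already installed there
        f"arguments = -cp {jar} {main_class}",
        f"transfer_input_files = {jar}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"request_cpus = {cpus}",
        f"request_memory = {memory_mb}",
        "log = job.log",
        "output = job.out",
        "error = job.err",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A large multi-threaded job and a small single-threaded one.
    print(build_submit_description("big.jar", "com.example.BigJob", 8, 16384))
    print(build_submit_description("small.jar", "com.example.SmallJob", 1, 512))
```

The printed text can be saved to a file and passed to condor_submit. Whether the big job actually gets a machine to itself still depends on how the slots are configured, which is what the WholeMachineSlots recipe covers.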

Cheers,
Tim


From: "Dmitry Grudzinskiy" <dgrudzinskiy@xxxxxxxxxx>
To: htcondor-users@xxxxxxxxxxx
Sent: Thursday, June 13, 2013 11:01:11 AM
Subject: [HTCondor-users] Choosing suitable universes for jobs

I asked a similar question yesterday (though I probably wasn't clear enough) and received a possible solution (thank you), but I still feel that I don't understand something.
Being new to condor, I'm getting a little confused when picking suitable universes for our jobs.

We're a java shop, and at first I looked only at the java universe. We have both single-threaded and multi-threaded jobs, but none of them require more than one machine. Some of the multi-threaded jobs require many CPUs and consume a lot of memory, while we also have a lot of very small single-threaded ones. Obviously we are concerned about a possible starvation problem, where the small single-threaded jobs keep filling the machines and never let the big ones run. Previously, with SGE or Moab, I would solve this with a flag passed to the submit command (qsub -R) that would reserve a machine for future use and not let any new jobs start on it.

HTCondor doesn't seem to have a similar solution; instead I was advised to use the parallel universe. What concerns me is that this universe seems to be designed for distributed-memory parallel tasks (i.e. MPI), whereas my job is just a single process using multiple threads. Though I don't see any problem with always setting machine_count=1 for these kinds of tasks, it feels to me like driving a Ferrari in first gear.

Am I going in the right direction?

Please advise,

Thank you in advance,
Dmitry
