
[HTCondor-users] Reserve all cores on a server?



I have a problem: users are submitting jobs that run multi-threaded programs, and I'm not sure how to restrict this or how to prevent other jobs from running on the same server.

 

So the submit file could look like this:

 

Executable     = run.sh
Universe       = vanilla
error          = run.err.$(Cluster).$(Process)
output         = run.out.$(Cluster).$(Process)
log            = run.log
Queue 10

 

And run.sh contains some file checks and then runs commands such as:

 

blastn -query data.fastq -task megablast -num_threads 4 -db nt

 

The problem occurs when the user submits this and we end up with 4 of these jobs running on the same server, which only has 4 cores. Each job has requested 4 threads, so 16 threads are competing for 4 cores; the server is overloaded by several hundred percent and is swapping jobs in and out like crazy. Obviously performance is terrible.

 

I'm not sure how to solve this. I could put in requirements = regexp("slot1_1", Machine) so it only runs one copy per server, but I can't see how to stop other users' jobs of a different type coming in and filling the other slots on the same server.
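
For concreteness, here is a minimal sketch of one possible direction, not a tested fix. HTCondor submit files accept a request_cpus command, so the job could declare the cores that blastn will actually use. This assumes the execute nodes offer slots with at least 4 cores (for example, partitionable slots); with one-core static slots a job asking for 4 cores would presumably never match.

```
# Sketch only: the same job as above, but declaring its core count.
# Assumes the execute nodes can provide a 4-core slot
# (e.g. partitionable slots); not verified on this pool.
Executable     = run.sh
Universe       = vanilla
request_cpus   = 4
error          = run.err.$(Cluster).$(Process)
output         = run.out.$(Cluster).$(Process)
log            = run.log
Queue 10
```

The request_cpus value would need to be kept in step with the -num_threads value passed to blastn in run.sh, and the approach relies on users declaring it honestly.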

 

Any ideas?

 

Thanks,

 

Russell Smithies
