
[Condor-users] Dynamic Slots & Parallel Universe



Hi All:

We have been working on a 1024-core cluster (8 cores per machine)
using a pretty standard Condor config. Each core shows up as a
single slot.

Users are starting to run multi-process jobs on the cluster, which
leads to over-scheduling. One way to combat this problem is the "whole
machine" configuration presented on the Wiki at
<https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=WholeMachineSlots>.
However, most of our users don't require the full machine; they need
combinations of 2, 3, 4, 5, ... cores. We could modify this config to
supply slots for 1/2 a machine, etc.

So a couple of questions:
1) Does this seem like a job for dynamic slots? Or should we modify the
"whole machine" config?

2) If dynamic slots are the way to go, have they been shown to be stable
in production environments?

3) Can we combine dynamic slot allocations with the Parallel
Universe to provide PBS-like allocations? Something like
machine_count = 4
request_cpus = 8

To match 4 machines with 8 CPUs apiece? Similar to
#PBS -l nodes=4:ppn=8
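
To make question 3 concrete, here is roughly the kind of submit file I
have in mind. This is only a sketch: the executable name and the
output/error/log file names are placeholders I made up, and it assumes
that request_cpus would be applied to each of the machine_count nodes,
which is exactly the part I am unsure about.

```
# Sketch only. File names below are hypothetical placeholders.
universe      = parallel
executable    = my_parallel_job.sh

# Ask for 4 nodes ...
machine_count = 4
# ... and (assumption) 8 CPUs carved out of a dynamic slot on each one.
request_cpus  = 8

# $(NODE) expands to the node number within the parallel job.
output        = job.out.$(NODE)
error         = job.err.$(NODE)
log           = job.log

queue
```

If this is not how the two features interact, a pointer to the right
way to express "4 nodes, 8 cores each" would be just as helpful.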

As always - thanks a lot!
David