Re: [Condor-users] Two pieces of software with different threading models using the same pool



Here's a fleshed-out, easy-to-follow example, if that is what you are looking for:

"How to allow some jobs to claim the whole machine instead of one slot"
https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=WholeMachineSlots

It should be fairly straightforward to adapt the sample to your needs.
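
To make the policy in Dave's quoted message below concrete, here is a toy
model of it in Python. It is only a sketch of the logic, not Condor
configuration syntax: the Slot class and the build_slots, start and claim
helpers are hypothetical names made up for illustration, and the real
START expressions would have to be written against the slot attributes
Condor actually advertises (see the wiki page above).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    name: str
    slot_type: int            # 1 = whole-machine slot, 2 = two-CPU slot
    cpus: int
    state: str = "Unclaimed"  # toy stand-in for the slot state


def build_slots(num_cpus: int) -> List[Slot]:
    """One type-1 slot with every CPU, plus num_cpus // 2 type-2 slots."""
    slots = [Slot("slot1", 1, num_cpus)]
    for i in range(num_cpus // 2):
        slots.append(Slot(f"slot{i + 2}", 2, 2))
    return slots


def start(slot: Slot, slots: List[Slot]) -> bool:
    """Toy version of the START policy described in the quoted message."""
    if slot.state == "Claimed":
        return False  # this slot is already busy
    others = [s for s in slots if s is not slot]
    if slot.slot_type == 1:
        # Whole-machine slot: refuse if any other slot is claimed.
        return all(s.state != "Claimed" for s in others)
    # Two-CPU slot: refuse if the whole-machine slot is claimed.
    return all(s.state != "Claimed" for s in others if s.slot_type == 1)


def claim(slot: Slot, slots: List[Slot]) -> bool:
    """Claim the slot if its START policy allows it; report success."""
    if start(slot, slots):
        slot.state = "Claimed"
        return True
    return False
```

With build_slots(8) you get one whole-machine slot and four two-CPU slots.
Claiming any two-CPU slot makes start() false for the whole-machine slot,
and claiming the whole-machine slot makes it false for all the others.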

Regards,
-B

On 2010-07-01, at 4:08 PM, David Kotz wrote:

> Neil,
> 
> If I understand your problem correctly, for your first job type, you
> want each job to claim an entire machine exclusively, and for your
> second job type, you want to allow up to NUM_CPUs/2 to run on a machine,
> as long as there are no jobs of the first type running.
> 
> In that case, you can refer to this thread from the list archive:
> 
> https://lists.cs.wisc.edu/archive/condor-users/2007-June/msg00295.shtml
> 
> You'd want to define two slot types:
> 
> Type 1 has all CPUs, memory, etc. on the machine.  NUM_SLOTS_TYPE_1 = 1.
> Type 1's START expression evaluates to FALSE if any of the other slots
> is claimed.
> 
> Type 2 is a mostly normal slot definition, except that each one claims
> two CPUs.  NUM_SLOTS_TYPE_2 = NUM_CPUS/2.  Type 2's START expression
> evaluates to FALSE if the state of slot Type 1 is "Claimed".
> 
> 
> Is that what you're trying to achieve?
> 
> - dave
> 
> 
> 
> On Thu, 2010-07-01 at 14:30 -0400, Neil Woodhouse wrote:
>> Fellow Condor Users,
>> 
>> 
>> 
>> I have two pieces of software, from different vendors, that are
>> Condor-enabled, one of which I have some control over. They use
>> different threading models. The first uses all of the CPUs on the
>> node, so NUM_CPUs is set to 1, giving one slot. The other, which I
>> can control, knows its limitations and runs optimally with 2 threads,
>> so the number of slots on each machine is divided by 2 and NUM_CPUs
>> is set to that value. The first piece of software needs to use one
>> slot and fails miserably if the number is changed.
>> 
>> 
>> 
>> Does anyone have an idea of how I might run both programs
>> concurrently in the same pool? I am assuming that I may change the
>> requirements. These jobs take hours to run, so at any one time any
>> slot may be in use. Dividing the pool is a possibility, but the
>> downtime for some of the resources may be too long, and the machines
>> should keep chugging along.
>> 
>> Neil
>> 
>> 
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> 
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/