Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] Guiding machine choices for the parallel universe

Date: Thu, 4 Dec 2008 15:11:53 +0000
From: "Alan Woodland" <alan.woodland@xxxxxxxxx>
Subject: [Condor-users] Guiding machine choices for the parallel universe

Hi,

Is there a way to "guide" Condor's choice of nodes for a parallel
universe job automatically? The cost of communication between nodes
in one of my pools is not uniform because of the network topology.
I'm looking for a way to make the Dedicated Scheduler aware of this,
so that it prefers to match nodes that are close to each other on the
network without preventing larger parallel jobs from using all the
machines.

Clearly users could write a requirements = or rank = line in their
job submit file, but I don't think it's reasonable or fair to expect
them to do this.

I was thinking of doing something along the lines of logically
dividing the nodes into n sub-pools (within which connectivity is good
between all the nodes), and giving each sub-pool a number. This would
then mean that an expression something like:

NEGOTIATOR_PRE_JOB_RANK = (MY.Universe == PARALLEL) *
((free_nodes_in_my_subpool - MY.Machine_count) * my_subpool_id)

would achieve this, provided the sub-pool IDs are suitably large.
Obviously this isn't syntactically correct just yet!
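To make the intended arithmetic concrete, here is a rough Python sketch of
what I mean. It is not ClassAd syntax and says nothing about how the
negotiator evaluates expressions; the function name, its parameters, and the
numeric universe code are my own placeholders.

```python
# Assumed numeric code for the parallel universe; a placeholder for this sketch.
PARALLEL = 11


def pre_job_rank(job_universe, machine_count, free_nodes_in_subpool, subpool_id):
    """Score one candidate machine for one job, mirroring the expression above.

    Non-parallel jobs always score 0, so the ranking only affects
    parallel universe jobs.
    """
    if job_universe != PARALLEL:
        return 0
    # Spare capacity left in the sub-pool after placing the job,
    # scaled by the (suitably large) sub-pool ID.
    return (free_nodes_in_subpool - machine_count) * subpool_id
```

For a job asking for 4 machines, a sub-pool with ID 1000 and 6 free nodes
scores 2000, while a sub-pool with ID 2000 and only 2 free nodes scores
-4000, so the sub-pool that cannot hold the whole job ranks lower.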

In practice, though, doing this is slightly harder than I'd hoped.

Firstly, is it true that (False * x) == 0 and (True * x) == x?

Secondly, how would I go about writing an expression that maps machine
names to some (pre-defined) sub-pool IDs? Or am I better off putting
that in the startd ads as a custom attribute?
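The mapping I have in mind is just a fixed lookup table, roughly like the
Python sketch below. The host names and IDs are invented for illustration,
and the fallback of 0 for unknown machines is an assumption on my part.

```python
# Hypothetical machine names and sub-pool IDs, for illustration only.
SUBPOOL_BY_HOST = {
    "node01.example.org": 1000,
    "node02.example.org": 1000,
    "node03.example.org": 2000,
    "node04.example.org": 2000,
}


def subpool_id(machine_name, default=0):
    """Return the pre-defined sub-pool ID for a machine name.

    Names are compared case-insensitively; machines that are not in
    the table fall back to `default`.
    """
    return SUBPOOL_BY_HOST.get(machine_name.lower(), default)
```

Writing the same table out as one long expression over machine names looks
unwieldy, which is why a per-machine custom attribute seems attractive.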

Thirdly, is MachineCount an attribute in parallel universe job
ClassAds? I can't see it listed in
http://www.cs.wisc.edu/condor/manual/v7.0/Appendix_A_ClassAd.html

Fourthly, how could free_nodes_in_my_subpool be implemented?
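To be clear about what that quantity means, here is a Python sketch that
counts free machines per sub-pool from a snapshot of the pool. The
(sub-pool ID, state) pair format and the use of "Unclaimed" to mean free are
assumptions for this illustration; it does not show how the value could be
made available inside a negotiator expression, which is the part I'm
asking about.

```python
from collections import Counter


def free_nodes_by_subpool(machines):
    """Count free machines in each sub-pool.

    `machines` is an iterable of (subpool_id, state) pairs; a machine
    is treated as free when its state is "Unclaimed" (an assumption
    made for this sketch).
    """
    return Counter(sp for sp, state in machines if state == "Unclaimed")


# Hypothetical snapshot of a four-machine pool.
snapshot = [
    (1000, "Unclaimed"),
    (1000, "Claimed"),
    (2000, "Unclaimed"),
    (2000, "Unclaimed"),
]
free = free_nodes_by_subpool(snapshot)
```

The difficulty is that this is a property of the whole sub-pool rather than
of any single machine ad.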

Or, more generally, is there a nicer way to solve this without
topological changes to the network or intervention from each user of
the parallel universe?

Thanks,
Alan

Follow-Ups:
- Re: [Condor-users] Guiding machine choices for the parallel universe
  - From: Greg Thain
