
Re: [Condor-users] Does bidirectional flocking also facilitate job forwarding?



On Sat, Dec 05, 2009 at 04:53:30PM -0800, Rob wrote:
> 
> Hello,
> 
> Imagine I have three condor masters, each managing their
> own pool of PCs. The masters are connected by "bidirectional"
> flocking as follows:
> 
>    master1  <---->  master2  <----> master3
> 
> so that master1/master2 can exchange jobs, and master2/master3
> can exchange jobs.
> 
> Is it then possible that jobs, submitted to master1, will be executed
> on a pool PC of master3?
> In other words, does flocking allow job forwarding by master2?

not in the setup you described.  flocking works by having the condor_schedd
(the condor daemon that manages the job queue) report to the condor_collector
(what you called the "master" node) in another pool.  the jobs stay in the
submitting schedd's queue and are never handed on to a third pool, so the pool
"master3" in the above scenario has no knowledge of the jobs in pool "master1".
it's like the condor_schedd is just borrowing the execute nodes of the other
pool.

however,
1) you can flock to multiple pools.  master1 can set its FLOCK_TO setting
in the condor_config file to a list of collectors, and then its jobs can run
in master2 and master3.  likewise for the other pools.  this is the easiest to
set up... see below.

2) instead of flocking, actual job forwarding can be done using the JobRouter 
and Condor-C, both documented well in the manual.  jobs forwarded from one pool
will exist in the schedd of the other pool, which is different from flocking.
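
for a rough idea of what option 2 involves, here is a minimal sketch of a
JobRouter route that hands jobs to a schedd in another pool over Condor-C.
the hostnames are made up for illustration, and the other settings the
JobRouter needs (enabling the daemon, route defaults, which jobs are
candidates for routing) are left out, so treat it as a starting point and
check the manual for your version:

```
JOB_ROUTER_ENTRIES = \
   [ name = "pool3"; \
     GridResource = "condor submit-pool3.your.net centralmanager-pool3.your.net"; \
   ]
```

the two names after "condor" in GridResource are the remote schedd and the
remote pool's collector, in that order.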

that's a pretty high-level answer.  if you need more details please just let us
know.


> ---------------
> 
> Another question: to configure bidirectional flocking, can I simply add
> the following in my master's condor_config.local file:
>    FLOCK_FROM = <comma separated list of hostnames and/or IP-numbers>
>    FLOCK_TO = ${FLOCK_FROM}

FLOCK_FROM looks good.  also, you can use wildcards, like *.your.net and
192.168.0.* so that you don't have to enumerate every machine.
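
as a sketch, using the same placeholder patterns (substitute your own domain
and subnet), that would look like:

```
# allow submit machines matching these patterns to flock jobs into this pool
FLOCK_FROM = *.your.net, 192.168.0.*
```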

for FLOCK_TO you'll want to set it to the list of other machines running the
condor_collector daemon.  example:

condor_config (in pool master1)
  FLOCK_TO = centralmanager-pool2.your.net, centralmanager-pool3.your.net

condor_config (in pool master2)
  FLOCK_TO = centralmanager-pool3.your.net, centralmanager-pool1.your.net

condor_config (in pool master3)
  FLOCK_TO = centralmanager-pool1.your.net, centralmanager-pool2.your.net


hope that helps!


cheers,
-zach