
Re: [Condor-users] Confused: condor_q -name option with flocking



On Aug 18, 2005, at 6:49 AM, John Horne wrote:

I am starting to look at flocking from a Condor 6.6.10 system to a pool on
a Condor 6.7.6 system. I am, however, a bit confused with some of the
condor command options.


The condor_status command states (from the 6.6 on-line manual):

-pool centralmanagerhostname[:portnumber]
(Query option) Query the specified central manager using an optional
port number. (condor_status queries COLLECTOR_HOST by default)


Now this seems to make sense, and the condor_q command states the same for
'-pool' but then states:


  -name name
     Show only the job queue of the named schedd

It also states further up:

a schedd name with the -name option, which causes the queue of the named
schedd to be queried


The condor_status command seems to work fine with the '-pool' option, but
for condor_q it seems I have to use both '-pool' and '-name'.


The confusion is what do I actually specify for the '-name' option? What
is 'a schedd name'? For the '-pool' option I have been specifying the FQDN
of the server I am flocking to, I have done the same with the '-name'
option and that seems to work (I can see the job queue on the remote
server). But what then is the point of having to use both '-pool' and
'-name' if they are both the same?

The short answer is that you don't need to care about condor_q -name when flocking. Your jobs are in your local schedd's queue and no other queue.


Now the long answer: A Condor pool is defined by the central manager (the collector, really). All Condor daemons that report to the same collector are part of the same pool. A pool only has one central manager (there are options for hot spares, but let's ignore them). A pool can have many schedds, each with its own queue of jobs.

When a schedd flocks, it talks to the central manager of a remote pool. It doesn't join that pool, but merely says it's interested in being matched with resources in that pool. The schedd is talking to the collector (and the negotiator once it's time to match-make) of the remote pool. It does not talk to any of the schedds in the remote pool, of which there may be many. The jobs do not get forwarded to a remote job queue.

Now for the -name and -pool options to condor_q. Since there can be many schedds in a pool, we need a way to distinguish them. We do this by giving them different names. Normally, a schedd's name is simply the name of the machine it's running on. If you want to run multiple schedds on the same machine, you need to give them different names (using the SCHEDD_NAME parameter in the config file). 'condor_status -schedd' will show you all the schedds in your pool.
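
For illustration, the naming part of that configuration might look like the fragment below. The value is made up, and this is only a sketch: actually running a second schedd on the same machine needs more configuration than the name alone, which is not shown here.

```
# Hypothetical name for an additional schedd on this machine
SCHEDD_NAME = second_schedd
```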

When you run condor_q (or any of the tools that talk to a schedd), it tries to talk to whatever schedd is running on your local machine. If you want it to talk to a different schedd, you use the -name option. If you want it to talk to a schedd in a different pool, you also have to use the -pool option to give the name of the pool (which is the name of the collector for that pool).
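
As a rough sketch of those three cases, the commands would look something like the following. All host names are hypothetical placeholders, and the small run() wrapper is only an assumption of this example: it prints each command and executes it only when DRY_RUN=0 is set, so the snippet can be read or tried without a Condor installation.

```shell
#!/bin/sh
# Hypothetical host names, for illustration only.
LOCAL_SCHEDD="submit2.example.org"          # another schedd in the local pool
REMOTE_POOL="cm.remote.example.org"         # collector (central manager) of the other pool
REMOTE_SCHEDD="submit.remote.example.org"   # a schedd that reports to that collector

# Print the command; run it only when DRY_RUN=0 is set in the environment.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Queue of the schedd on this machine (where flocked jobs still live).
run condor_q

# Queue of a different schedd in the local pool.
run condor_q -name "$LOCAL_SCHEDD"

# Queue of a schedd in a different pool: -pool names the collector.
run condor_q -pool "$REMOTE_POOL" -name "$REMOTE_SCHEDD"

# List the schedds known to the other pool's collector.
run condor_status -pool "$REMOTE_POOL" -schedd
```

In the original question the same FQDN worked for both options only because the remote central manager machine also happens to run a schedd named after that machine.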

condor_status talks to the collector, so having both -name and -pool is redundant for it, since the name of the collector is the name of the pool.

+----------------------------------+---------------------------------+
|            Jaime Frey            |  Public Split on Whether        |
|        jfrey@xxxxxxxxxxx         |  Bush Is a Divider              |
|  http://www.cs.wisc.edu/~jfrey/  |         -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+