Hi!

Thanks for the answers, Ian!

My Condor pool consists of 4 machines (3 of them are SMP machines). condor_status lists the following:

Name               OpSys    Arch   State     Activity LoadAv Mem  ActvtyTime

O2F-sth-LAP-002.un WINNT51  INTEL  Unclaimed Idle     0.000  1527 0+00:40:04
slot1@o2f-mbl-lap- WINNT51  INTEL  Unclaimed Idle     0.000  1767 0+01:51:58
slot2@o2f-mbl-lap- WINNT51  INTEL  Unclaimed Idle     0.000  1767 0+01:57:04
slot1@O2F-STH-LAP- WINNT60  INTEL  Unclaimed Idle     0.810  1534 0+02:05:04
slot2@O2F-STH-LAP- WINNT60  INTEL  Unclaimed Idle     0.000  1534 0+02:05:05
slot1@o2f-sth-lap- WINNT61  INTEL  Unclaimed Idle     0.000  1767 0+01:21:24
slot2@o2f-sth-lap- WINNT61  INTEL  Unclaimed Idle     0.000  1767 0+01:21:25

               Total Owner Claimed Unclaimed Matched Preempting Backfill

 INTEL/WINNT51     3     0       0         3       0          0        0
 INTEL/WINNT60     2     0       0         2       0          0        0
 INTEL/WINNT61     2     0       0         2       0          0        0

         Total     7     0       0         7       0          0        0

The different colors mark different machines. The central manager is marked with green.

When I submit a job, the only machine that changes status
from Unclaimed to Claimed is the central manager (condor_status output below):

Name               OpSys    Arch   State     Activity LoadAv Mem  ActvtyTime

O2F-sth-LAP-002.un WINNT51  INTEL  Unclaimed Idle     0.000  1527 0+00:45:04
slot1@o2f-mbl-lap- WINNT51  INTEL  Unclaimed Idle     0.000  1767 0+01:51:58
slot2@o2f-mbl-lap- WINNT51  INTEL  Unclaimed Idle     0.000  1767 0+01:57:04
slot1@O2F-STH-LAP- WINNT60  INTEL  Unclaimed Idle     0.810  1534 0+02:05:04
slot2@O2F-STH-LAP- WINNT60  INTEL  Unclaimed Idle     0.000  1534 0+02:05:05
slot1@o2f-sth-lap- WINNT61  INTEL  Claimed   Busy     0.000  1767 0+00:00:05
slot2@o2f-sth-lap- WINNT61  INTEL  Claimed   Busy     0.000  1767 0+00:00:05

               Total Owner Claimed Unclaimed Matched Preempting Backfill

 INTEL/WINNT51     3     0       0         3       0          0        0
 INTEL/WINNT60     2     0       0         2       0          0        0
 INTEL/WINNT61     2     0       2         0       0          0        0

         Total     7     0       2         5       0          0        0

Why is it only the central manager that changes to Claimed? I want all the machines to execute jobs, but only the central manager should be able to submit jobs. All the machines have START=TRUE and STARTD in the DAEMON_LIST.

>Just for some clarification: is this the condor_credd daemon running on your central manager machine?

Yes, condor_credd is running only on the central machine.

>You only need one credd daemon for an entire pool, not one on each machine.
>Every machine should be connecting to the condor_credd daemon on your central manager to get credentials for users.

Is this done by default? If not, how should I configure it?

Another question: I have tried to run condor_birdwatcher, but it says that Condor is off, although I believe Condor is running. How does condor_birdwatcher work?

Cheers,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On behalf of Ian Chesal

On Fri, Sep 3, 2010 at 8:50 AM, Sónia Liléo <sonia.lileo@xxxxx> wrote:

>Hi again! The jobs are now running in the central manager. I added STARTD to the daemon_list.

Perfect. Nice work.
If the state of the machine is still Owner it means START =
False on the box and that's why it isn't running your jobs.
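
For reference, the two settings discussed here could look roughly like the fragment below on an execute machine. This is only a sketch, not taken from anyone's actual files in this thread: the file it lives in (a local config file is assumed) and the exact daemon list for each machine are assumptions.

```
# Hypothetical local config fragment for an execute-only machine (assumed, not from the thread).
# Run the master plus the startd so the machine advertises its slots to the pool.
DAEMON_LIST = MASTER, STARTD

# Always willing to start jobs. A START expression that evaluates to False
# leaves the slot in the Owner state instead of Unclaimed.
START = TRUE
```

Running condor_config_val START on the machine in question shows the value that is actually in effect there, and condor_reconfig asks the running daemons to reread their configuration after a change.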
Just for some clarification: is this the condor_credd daemon
running on your central manager machine? You only need one credd daemon for an
entire pool, not one on each machine. Every machine should be connecting to the
condor_credd daemon on your central manager to get credentials for users.
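
As a rough sketch of what "connecting to the condor_credd daemon on your central manager" can look like in configuration, something along these lines could go on each execute machine. The knob names come from the Condor manual's Windows credd setup; the assumption here is that the credd runs on the same host as the central manager, so $(CONDOR_HOST) stands in for it. The manual's example also sets authentication and encryption options, which are left out of this sketch.

```
# Hypothetical fragment for each execute machine (not from the original thread).
# Point every machine at the single pool-wide credd; assumes it runs on the central manager.
CREDD_HOST = $(CONDOR_HOST)

# Let the starter run jobs as the submitting user, using credentials fetched from the credd.
STARTER_ALLOW_RUNAS_OWNER = True

# Keep a fetched password in the local secure store so later jobs
# do not need to contact the credd again.
CREDD_CACHE_LOCALLY = True
```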
This is from the machine where jobs are not running but you
would like them to run? That last line indicates the machine is Unclaimed -- so
START != False and the machine could potentially run jobs. Can you show me the output of condor_status and indicate
which machine you'd like the jobs to be running on?
It's hard to say at this point.

- Ian

Cycle Computing, LLC