
Re: [Condor-users] two central managers on the same network




Tom,

From your description, it sounds like COLLECTOR_HOST in the configuration of mo99 is still pointing to the old central manager, ladon.

Try this command:

condor_config_val -v COLLECTOR_HOST
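
If the reported value still names ladon, a small script along these lines could help locate leftover references in a configuration file. This is only a minimal sketch: the sample configuration text and the function name find_stale_references are assumptions made for illustration, and it does not call any Condor API. It simply scans text for the old hostname or IP address, skipping comment lines.

```python
import re


def find_stale_references(config_text, old_names):
    """Return (line_number, line) pairs for active settings that mention an old name."""
    # Escape each name so the dots in an IP address are matched literally.
    pattern = re.compile("|".join(re.escape(name) for name in old_names), re.IGNORECASE)
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Ignore blank lines and comments; only active settings matter here.
        if not stripped or stripped.startswith("#"):
            continue
        if pattern.search(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical config contents, not taken from the poster's actual files.
sample = """\
# Central manager (was ladon)
CONDOR_HOST = mo99
COLLECTOR_HOST = ladon
NEGOTIATOR_HOST = $(CONDOR_HOST)
"""

for number, line in find_stale_references(sample, ["ladon", "192.168.0.100"]):
    print(f"line {number}: {line}")
```

In practice the text would be read from whichever file condor_config_val -v reports as the source of the setting, since a local config file may override the global one.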

--Dan

Tom Ghekiere wrote:
Hi all,


We currently have a Condor cluster (version 7.0.0) with a master node (hostname: ladon; IP: 192.168.0.100) that we would like to replace with a newer one (hostname: mo99; IP: 192.168.0.101). Both servers are also connected to the internet. All worker nodes are connected only to the LAN (192.168.0.xxx). All systems run Linux.

I installed Condor on mo99 and configured it (changing every 'ladon' to 'mo99' and every 192.168.0.100 to 192.168.0.101, and also changing COLLECTOR_NAME). I removed a worker node (node14) from the ladon pool and configured it to join mo99.
I then started Condor on mo99 and node14.

This is what condor_status on node14 returns:

Name               OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime

slot1@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:04
slot2@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:05
slot3@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:06
slot4@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:07
slot5@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:08
slot6@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:09
slot7@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:10
slot8@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:03

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill

X86_64/LINUX      8      0        0          8        0           0         0

       Total      8      0        0          8        0           0         0



On mo99, however, condor_status returns:

Name               OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime

slot1@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.060   2011  9+23:15:55
slot2@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:15:56
slot3@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:15:57
slot4@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:15:58
slot5@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:15:59
slot6@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:16:00
slot7@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:16:01
slot8@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   2011  9+23:15:54
   ...
   ...
   ...
slot1@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:14
slot2@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:19
slot3@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+01:35:18
slot4@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+02:15:54
slot5@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+00:49:52
slot6@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+00:51:16
slot7@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+00:48:06
slot8@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   2011  0+00:52:06

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill

X86_64/LINUX    160      8       62         90        0           0         0

       Total    160      8       62         90        0           0         0



This is exactly what condor_status returns on ladon (the master that is being phased out). To me, it seems mo99 is connecting to the wrong pool.

Does anyone have an idea of what I am doing wrong, and perhaps how to solve it?


Thanks,

Tom