
Re: [Condor-users] CondorView, condor_stats



On Wed, Jul 21, 2004 at 11:34:37AM -0500, Andrew Zahn wrote:
> I believe the problems with the condorview not working are a result of
> basic config issues.
> 
> I am having a problem setting up the CondorView server. You are supposed
> to set up a separate condor collector running on another machine that
> will act as the condorview server.

Yes. Technically you don't have to; it's just usually a good idea.

> I cannot determine any way to test that this
> collector is operating properly. Is there a way to ping the condor
> collector? 

condor_status -pool viewservername.wherever.your.domain
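
As a rough complement to condor_status, a plain TCP connect can show whether anything is listening on the collector port at all. This is only a sketch under assumptions: the function name port_is_open is made up for illustration, 9618 is the port that appears in the logs quoted below, and a successful connect says nothing about whether the collector is actually answering queries.

```python
import socket

def port_is_open(host: str, port: int = 9618, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Condor. It only checks that something
    accepts TCP connections on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False

# Example call (host name taken from the command above):
#   port_is_open("viewservername.wherever.your.domain")
```

If this returns False, condor_status -pool against the same host is unlikely to get an answer either. If it returns True, condor_status is still the real test.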

> I don't see a condor_view in the running processes.

There is no separate condor_view program. The view server is just a
condor_collector.

> Also,
> it appeared to create files in the directory I specified for history but
> they are all of 0 size. Does this condor install have to be configured
> fully? i.e., does it have to be configured to accept job submits or can it
> function only for condorview purposes?
> 

It can function only as a condor-view server if you want. 

> Also it relies heavily on condor_stats, which doesn't seem to work on my
> machine. No matter what parameters I give it, the return is always the
> help display. So possibly there is something wrong with my condor_stats?
> 

Do you have condor_view_host defined in your config file, and pointing
at the view server? It will also need to be in the config file of your
central manager. 

> Below is an excerpt from my MasterLog on the Tier2-03 machine (the
> CondorView server) where it is showing some errors with the collector.
> Below that is an excerpt from the collectorlog where it shows that it
> seems to be collecting the data and writing to those files but in
> reality the actual files are dated yesterday and they have size 0.
> 

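What do you have your DAEMON_LIST set to? You cannot have both
COLLECTOR and VIEW_SERVER set in it. Judging by your MasterLog, the
master starts condor_collector twice, and it looks like the second copy
cannot listen on port 9618 because the first one already has it. You
should also take out NEGOTIATOR.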
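
The checks suggested in this reply can be sketched as a small script. It is an illustration only, not a Condor tool: the function names and the sample config are hypothetical, it reads simple NAME = value lines without expanding macros or following local config files, and upper-casing the names before comparing them is an assumption of the sketch.

```python
def parse_config(text: str) -> dict:
    """Collect NAME = value pairs; later definitions override earlier ones."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().upper()] = value.strip()
    return settings

def view_server_warnings(text: str) -> list:
    """Return warnings for a machine meant to run only as a view server."""
    settings = parse_config(text)
    daemons = {d.strip().upper()
               for d in settings.get("DAEMON_LIST", "").split(",")
               if d.strip()}
    warnings = []
    if {"COLLECTOR", "VIEW_SERVER"} <= daemons:
        warnings.append("DAEMON_LIST has both COLLECTOR and VIEW_SERVER")
    if "NEGOTIATOR" in daemons:
        warnings.append("DAEMON_LIST includes NEGOTIATOR")
    if not settings.get("CONDOR_VIEW_HOST"):
        warnings.append("CONDOR_VIEW_HOST is not defined")
    return warnings

# Hypothetical condor_config.local contents, for illustration only.
sample = """
# view-server-only machine
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD, VIEW_SERVER
"""
for warning in view_server_warnings(sample):
    print("warning:", warning)
```

An empty list only means none of these three conditions was found in the text it was given. It does not prove the configuration is correct.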

-Erik

> 
> MasterLog:
> 
> 7/21 11:00:07 ******************************************************
> 7/21 11:00:07 ** condor_master (CONDOR_MASTER) STARTING UP
> 7/21 11:00:07 ** $CondorVersion: 6.6.5 May  3 2004 $
> 7/21 11:00:07 ** $CondorPlatform: I386-LINUX-RH72 $
> 7/21 11:00:07 ** PID = 5577
> 7/21 11:00:07 ******************************************************
> 7/21 11:00:07 Using config file: /vdt/condor/etc/condor_config
> 7/21 11:00:07 Using local config files: /vdt/condor/local.tier2-03/condor_config.local
> 7/21 11:00:07 DaemonCore: Command Socket at <10.255.255.221:52580>
> 7/21 11:00:07 Log file not found in config file: VIEW_SERVER_LOG
> 7/21 11:00:07 Started DaemonCore process "/vdt/condor/sbin/condor_collector", pid and pgroup = 5578
> 7/21 11:00:07 Started DaemonCore process "/vdt/condor/sbin/condor_negotiator", pid and pgroup = 5579
> 7/21 11:00:07 Started DaemonCore process "/vdt/condor/sbin/condor_startd", pid and pgroup = 5580
> 7/21 11:00:07 Started DaemonCore process "/vdt/condor/sbin/condor_schedd", pid and pgroup = 5581
> 7/21 11:00:07 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:00:07 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:00:07 restarting /vdt/condor/sbin/condor_collector in 10 seconds
> 7/21 11:00:17 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:00:17 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:00:17 restarting /vdt/condor/sbin/condor_collector in 120 seconds
> 7/21 11:02:17 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:02:17 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:02:17 restarting /vdt/condor/sbin/condor_collector in 10 seconds
> 7/21 11:02:27 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:02:27 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:02:27 restarting /vdt/condor/sbin/condor_collector in 11 seconds
> 7/21 11:02:38 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:02:38 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:02:38 restarting /vdt/condor/sbin/condor_collector in 13 seconds
> 7/21 11:02:51 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:02:51 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:02:51 restarting /vdt/condor/sbin/condor_collector in 17 seconds
> 7/21 11:03:08 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:03:08 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:03:08 restarting /vdt/condor/sbin/condor_collector in 25 seconds
> 7/21 11:03:33 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:03:33 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:03:33 restarting /vdt/condor/sbin/condor_collector in 41 seconds
> 7/21 11:04:14 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:04:14 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:04:14 restarting /vdt/condor/sbin/condor_collector in 73 seconds
> 7/21 11:05:27 Create_Process:Failed to post listen on command socket(s) (port 9618)
> 7/21 11:05:27 ERROR: Create_Process failed trying to start /vdt/condor/sbin/condor_collector
> 7/21 11:05:27 restarting /vdt/condor/sbin/condor_collector in 137 seconds
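
The repeating pattern in the MasterLog excerpt above (failed listen, failed start, scheduled restart) is easier to see when summarized. The following is a minimal sketch, not a Condor utility: summarize_master_log is a made-up name, and the two regular expressions are based only on the line shapes visible in this excerpt. It works whether or not the log lines have been wrapped by the mailer.

```python
import re

# Patterns taken from the lines quoted in this thread; other Condor
# versions may word these messages differently.
RESTART = re.compile(r"restarting (\S+) in (\d+) seconds")
PORT = re.compile(r"\(port (\d+)\)")

def summarize_master_log(text: str) -> dict:
    """Group restart delays by daemon path and list ports named in the log."""
    restarts = {}
    for path, delay in RESTART.findall(text):
        restarts.setdefault(path, []).append(int(delay))
    # In this excerpt "(port N)" only appears on the failed-listen messages.
    ports = sorted({int(p) for p in PORT.findall(text)})
    return {"restarts": restarts, "ports": ports}

# A few lines copied from the excerpt, including a mailer-wrapped one.
sample = """\
7/21 11:00:07 Create_Process:Failed to post listen on command socket(s)
(port 9618)
7/21 11:00:07 ERROR: Create_Process failed trying to start
/vdt/condor/sbin/condor_collector
7/21 11:00:07 restarting /vdt/condor/sbin/condor_collector in 10 seconds
7/21 11:00:17 restarting /vdt/condor/sbin/condor_collector in 120 seconds
"""
print(summarize_master_log(sample))
```

Run over the full excerpt, this would show a single daemon path being restarted again and again, always against port 9618, which fits a second collector competing with the one already running as pid 5578.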
> 
> 
> 
> CollectorLog:
> 
> 7/21 11:00:07 ******************************************************
> 7/21 11:00:07 ** condor_collector (CONDOR_COLLECTOR) STARTING UP
> 7/21 11:00:07 ** $CondorVersion: 6.6.5 May  3 2004 $
> 7/21 11:00:07 ** $CondorPlatform: I386-LINUX-RH72 $
> 7/21 11:00:07 ** PID = 5578
> 7/21 11:00:07 ******************************************************
> 7/21 11:00:07 Using config file: /vdt/condor/etc/condor_config
> 7/21 11:00:07 Using local config files: /vdt/condor/local.tier2-03/condor_config.local
> 7/21 11:00:07 DaemonCore: Command Socket at <10.255.255.221:9618>
> 7/21 11:00:07 In ViewServer::Init()
> 7/21 11:00:07 In CollectorDaemon::Init()
> 7/21 11:00:07 In ViewServer::Config()
> 7/21 11:00:07 In CollectorDaemon::Config()
> 7/21 11:00:07 enable: Creating stats hash table
> 7/21 11:00:07 Configuration: SAMPLING_INTERVAL=60, MAX_STORAGE=10000000, MaxFileSize=333333, POOL_HISTORY_DIR=/vdt/condor/local.tier2-03/history
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.0.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.0.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.1.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.1.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.2.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist0.2.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.0.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.0.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.1.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.1.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.2.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist1.2.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.0.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.0.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.1.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.1.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.2.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist2.2.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.0.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.0.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.1.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.1.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.2.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist3.2.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.0.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.0.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.1.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.1.new , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.2.old , StartTime=-1
> 7/21 11:00:07 FileName=/vdt/condor/local.tier2-03/history/viewhist4.2.new , StartTime=-1
> 7/21 11:05:08 Accumulating data: Time=1090425908
> 7/21 11:06:08 Accumulating data: Time=1090425968
> 7/21 11:07:08 Accumulating data: Time=1090426028
> 7/21 11:08:08 Accumulating data: Time=1090426088
> 7/21 11:09:08 Accumulating data: Time=1090426148
> 7/21 11:10:08 Accumulating data: Time=1090426208
> 7/21 11:11:08 Accumulating data: Time=1090426268
> 7/21 11:12:08 Accumulating data: Time=1090426328
> 7/21 11:13:08 Accumulating data: Time=1090426388