
Re: [Condor-users] Condor 7.7.5: MasterLog is flooded with ProcAPI errors...



The problem occurs while parsing the stat file "/proc/999/stat".
Condor expects the string between the parentheses to contain no white space.

>999 (ddclient - slee) S 1 988 988 0 -1 4202560 29441 202449 0 0 69 54 90 208 20 0 1 0 5837 10022912 1249 4294967295 134512640 134516240 3217069360 3217068344 8217622 0 0 128 16385 3225853652 0 0 17 0 0 0 0 0 0 134520336 134520752 152264704
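
To illustrate why a name with spaces upsets a whitespace-delimited parse, here is a minimal Python sketch. It is not Condor's ProcAPI code; it only assumes a parser that treats every space as a field separator, and it uses a shortened copy of the line above.

```python
# Shortened copy of the stat line quoted above (first few fields only).
line = "999 (ddclient - slee) S 1 988 988 0 -1 4202560"

# Assumption: the parser splits on white space and expects exactly one
# token for the name in parentheses, followed by the process state.
fields = line.split()
naive_comm = fields[1]   # only the first word of the name
naive_state = fields[2]  # should be the state "S", but is part of the name

print(naive_comm, naive_state)
```

Every later field is shifted by the same amount, so a scan that expects a fixed number of typed fields would come up short.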

But the question is: why is there white space at all?
According to the manual page "man 5 proc", that field should be:

   The  filename  of  the  executable,  in parentheses.
   This is visible whether or  not  the  executable  is
   swapped out.
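
One possible explanation, which is an assumption and not confirmed here, is that the process renamed itself at run time (for example via prctl(PR_SET_NAME)); the kernel keeps at most 15 characters of that name, and "ddclient - slee" is exactly 15 characters long. Whatever the cause, a parser can tolerate such names by taking everything between the first "(" and the last ")" as the name. A minimal Python sketch of that idea follows; parse_stat_line is a hypothetical helper, not part of Condor.

```python
def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, remaining fields).

    Hypothetical helper for illustration. The name may contain spaces
    or parentheses, so it is delimited by the first "(" and the last ")".
    """
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    rest = line[close_idx + 1:].split()  # state, ppid, pgrp, ...
    return pid, comm, rest


# Shortened copy of the stat line from this thread.
sample = "999 (ddclient - slee) S 1 988 988 0 -1 4202560"
pid, comm, rest = parse_stat_line(sample)
print(pid, repr(comm), rest[0])
```

With this approach the state and the numeric fields keep their expected positions regardless of what the name contains.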

Regards,
Lukas

On Wed, May 23, 2012 at 08:37:12PM -0700, Rob wrote:
> Hi,
> 
> After starting the Master daemon, the MasterLog looks like this:
> 
> 
> 05/24/12 12:20:22 ******************************************************
> 05/24/12 12:20:22 ** condor_master (CONDOR_MASTER) STARTING UP
> 05/24/12 12:20:22 ** /usr/sbin/condor_master
> 05/24/12 12:20:22 ** SubsystemInfo: name=MASTER type=MASTER(2) class=DAEMON(1)
> 05/24/12 12:20:22 ** Configuration: subsystem:MASTER local:<NONE> class:DAEMON
> 05/24/12 12:20:22 ** $CondorVersion: 7.7.5 Mar 07 2012 $
> 05/24/12 12:20:22 ** $CondorPlatform: I686-Fedora_16 $
> 05/24/12 12:20:22 ** PID = 3347
> 05/24/12 12:20:22 ** Log last touched 5/24 12:20:07
> 05/24/12 12:20:22 ******************************************************
> 05/24/12 12:20:22 Using config source: /etc/condor/condor_config
> 05/24/12 12:20:22 Using local config sources: 
> 05/24/12 12:20:22    /etc/condor/config.d/00personal_condor.config
> 05/24/12 12:20:22    /etc/condor/config.d/01personal_condor.config
> 05/24/12 12:20:22 DaemonCore: command socket at <25.125.10.62:45300>
> 05/24/12 12:20:22 DaemonCore: private command socket at <25.125.10.62:45300>
> 05/24/12 12:20:22 Setting maximum accepts per cycle 8.
> 05/24/12 12:20:22 Started DaemonCore process "/usr/sbin/condor_collector", pid and pgroup = 3348
> 05/24/12 12:20:22 Waiting for /var/log/condor/.collector_address to appear.
> 05/24/12 12:20:23 Found /var/log/condor/.collector_address.
> 05/24/12 12:20:23 Started DaemonCore process "/usr/sbin/condor_negotiator", pid and pgroup = 3349
> 05/24/12 12:20:23 Started DaemonCore process "/usr/sbin/condor_schedd", pid and pgroup = 3350
> 05/24/12 12:20:24 ProcAPI: Unexpected short scan on /proc/999/stat, errno: 11.
> 05/24/12 12:20:24 ProcAPI: Unexpected short scan on /proc/999/stat, errno: 11.
> 05/24/12 12:20:24 ProcAPI: Unexpected short scan on /proc/999/stat, errno: 11.
> 05/24/12 12:20:24 ProcAPI: Unexpected short scan on /proc/999/stat, errno: 11.
> 05/24/12 12:20:24 ProcAPI: Unexpected short scan on /proc/999/stat, errno: 11.
> 
> and zillions more of the last lines continue to flood the MasterLog at a rate of 5 lines per second.
> 
> The program with PID 999 has nothing to do with Condor. The file /proc/999/stat contains
> 
> 999 (ddclient - slee) S 1 988 988 0 -1 4202560 29441 202449 0 0 69 54 90 208 20 0 1 0 5837 10022912 1249 4294967295 134512640 134516240 3217069360 3217068344 8217622 0 0 128 16385 3225853652 0 0 17 0 0 0 0 0 0 134520336 134520752 152264704
> 
> If relevant:
> this is currently happening on two computers with kernels 3.3.4-1.fc16 and 3.3.6-3.fc16; the first is the central Condor master (with a large pool), the second is a Condor master (without a pool) that flocks to the first one.
> 
> 
> Why is this happening?
> Or is this an indication that something is going wrong elsewhere?
> Any suggestions?
> 
> 
> Rob.
> 