
Re: [Condor-users] ProcAPI sanity failure, age = -98161996



Here is the corresponding StarterLog.vm2 on ORCLUS01.na.sas.com.

Thanks,
Matt


3/13 10:57:06 ******************************************************
3/13 10:57:06 ** condor_starter (CONDOR_STARTER) STARTING UP
3/13 10:57:06 ** C:\condor\bin\condor_starter.exe
3/13 10:57:06 ** $CondorVersion: 6.7.17 Feb 18 2006 $
3/13 10:57:06 ** $CondorPlatform: INTEL-WINNT50 $
3/13 10:57:06 ** PID = 3660
3/13 10:57:06 ******************************************************
3/13 10:57:06 Using config file: C:\condor\condor_config
3/13 10:57:06 Using local config files: C:\condor/condor_config.local
3/13 10:57:06 DaemonCore: Command Socket at <10.40.12.183:4696>
3/13 10:57:06 SEC_DEFAULT_SESSION_DURATION is undefined, using default value of 3600
3/13 10:57:06 Setting resource limits not implemented!
3/13 10:57:06 STARTER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/13 10:57:06 Communicating with shadow <10.40.12.183:4689>
3/13 10:57:06 Shadow version: $CondorVersion: 6.7.17 Feb 18 2006 $
3/13 10:57:06 Submitting machine is "ORCLUS01.na.sas.com"
3/13 10:57:06 ShouldTransferFiles is "YES", transfering files
3/13 10:57:06 STARTER_ALLOW_RUNAS_OWNER is undefined, using default value of False
3/13 10:57:06 init_user_ids: want user 'nobody@.', current is '(null)@(null)'
3/13 10:57:06 Using dynamic user account.
3/13 10:57:06 dynuser: Re-enabling account (condor-reuse-vm2)
3/13 10:57:06 dynuser::createuser(condor-reuse-vm2) successful
3/13 10:57:06 perm::init() starting up for account (condor-reuse-vm2) domain (NULL)
3/13 10:57:06 perm::init: Found Account Name condor-reuse-vm2
3/13 10:57:06 Done moving to directory "C:\condor\execute\dir_3660"
3/13 10:57:06 TokenCache contents: 
condor-reuse-vm2@.
3/13 10:57:06 JICShadow::initIOProxy(): Job does not define WantIOProxy
3/13 10:57:06 No StarterUserLog found in job ClassAd
3/13 10:57:06 Starter will not write a local UserLog
3/13 10:57:06 Changing the executable name
3/13 10:57:06 entering FileTransfer::Init
3/13 10:57:06 entering FileTransfer::SimpleInit
3/13 10:57:06 TransferIntermediate="(none)"
3/13 10:57:06 entering FileTransfer::DownloadFiles
3/13 10:57:06 STARTER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/13 10:57:06 entering FileTransfer::Download
3/13 10:57:06 About to sock duplicate, old sock=6C0 new sock=FFFFFFFF state=0
3/13 10:57:06 Socket duplicated, old sock=6C0 new sock=698 state=0
3/13 10:57:06 In win32_thread_start_func
3/13 10:57:06 entering FileTransfer::DownloadThread
3/13 10:57:06 entering FileTransfer::DoDownload sync=1
3/13 10:57:06 TokenCache contents: 
condor-reuse-vm2@.
3/13 10:57:06 get_file(): going to write to filename C:\condor/execute\dir_3660\condor_exec.exe
3/13 10:57:06 get_file: Receiving 473 bytes
3/13 10:57:06 get_file: wrote 473 bytes to file
3/13 10:57:06 ReliSock::get_file_with_permissions(): received null permissions from peer, not setting
3/13 10:57:06 ProcAPI sanity failure, cpuusage = -0.000000
3/13 10:57:06 ProcAPI sanity failure, age = -98162766
3/13 10:57:06 STARTER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/13 10:57:06 File transfer completed successfully.
3/13 10:57:07 Calling client FileTransfer handler function.
3/13 10:57:07 Job 13.1 set to execute immediately
3/13 10:57:07 DaemonCore: in SendAliveToParent()
3/13 10:57:07 DaemonCore: attempting to connect to '<10.40.12.183:1737>'
3/13 10:57:07 STARTER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/13 10:57:07 SEC_TCP_SESSION_TIMEOUT is undefined, using default value of 20
3/13 10:57:07 Starting a VANILLA universe job with ID: 13.1
3/13 10:57:07 In OsProc::OsProc()
3/13 10:57:07 Main job KillSignal: 15 (Unknown)
3/13 10:57:07 Main job RmKillSignal: 15 (Unknown)
3/13 10:57:07 Main job HoldKillSignal: 15 (Unknown)
3/13 10:57:07 in VanillaProc::StartJob()
3/13 10:57:07 Executable is .bat, so running C:\WINDOWS\system32\cmd.exe /Q /C condor_exec.bat
3/13 10:57:07 in OsProc::StartJob()
3/13 10:57:07 IWD: C:\condor/execute\dir_3660
3/13 10:57:07 TokenCache contents: 
condor-reuse-vm2@.
3/13 10:57:07 Input file: NUL
3/13 10:57:07 Output file: C:\condor/execute\dir_3660\hello1.out
3/13 10:57:07 Error file: NUL
3/13 10:57:07 Renice expr "10" evaluated to 10
3/13 10:57:07 About to exec C:\WINDOWS\system32\cmd.exe condor_exec.exe /Q /C condor_exec.bat
3/13 10:57:07 Env = _CONDOR_SCRATCH_DIR=C:\condor\execute\dir_3660
3/13 10:57:07 GetBinaryType() returned 0
3/13 10:57:07 TokenCache contents: 
condor-reuse-vm2@.
3/13 10:57:07 Create_Process: CreateProcess failed, errno=5
3/13 10:57:07 ERROR "Create_Process(C:\WINDOWS\system32\cmd.exe,condor_exec.exe /Q /C condor_exec.bat, ...) failed" at line 373 in file ..\src\condor_starter.V6.1\os_proc.C
3/13 10:57:07 ShutdownFast all jobs.
3/13 10:57:07 Got ShutdownFast when no jobs running.
3/13 10:57:31 NET_REMAP_ENABLE is undefined, using default value of False



> > Here's the shadow log on the submit machine - I am not sure
> > if that helps...
> > 
> 
> What would be more useful would be StarterLog.vm2 on 
> ORCLUS01.na.sas.com
> 
> > 
> > In the MasterLog, I also keep seeing the following:
> > "ProcAPI sanity failure, age = xxxx". This error seems serious.
> 
> I think we fixed this bug just this morning (the type we were
> using didn't have enough precision, hence the bogus value) -
> it will be in 6.7.18.
> 
> -Erik
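
For anyone reading this thread later: the actual Condor fix is not shown here, so the following is only a minimal sketch of one way a too-narrow type can turn a valid process age into a negative "sanity failure" value. It assumes timestamps counted in 100 ns ticks and an age stored in a signed 32-bit variable. The tick unit, the function names, and the 32-bit width are all assumptions made for illustration; none of them are taken from the ProcAPI source.

```python
# Illustration only: NOT Condor's ProcAPI code. All names and the
# 32-bit width are hypothetical, chosen to show the general failure mode.

TICKS_PER_SECOND = 10_000_000  # assumed unit: 100 ns ticks


def to_int32(value):
    """Wrap an integer the way a signed 32-bit C variable would store it."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def age_ticks_wide(now_ticks, birth_ticks):
    """Age computed and kept in a type wide enough to hold it."""
    return now_ticks - birth_ticks


def age_ticks_narrow(now_ticks, birth_ticks):
    """Same difference, but stored in a (hypothetical) signed 32-bit type."""
    return to_int32(now_ticks - birth_ticks)


def is_sane(age):
    """A sanity check of the kind the log message suggests: age must be >= 0."""
    return age >= 0


# A process that started 300 seconds ago (arbitrary example timestamps).
birth = 1_000_000_000_000
now = birth + 300 * TICKS_PER_SECOND

print("wide age:  ", age_ticks_wide(now, birth), is_sane(age_ticks_wide(now, birth)))
print("narrow age:", age_ticks_narrow(now, birth), is_sane(age_ticks_narrow(now, birth)))
```

With these assumed units, any age above roughly 214 seconds no longer fits in 31 bits, so the narrow variant wraps to a negative number and fails the check even though the process itself is fine.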