
[Condor-users] Attempting to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', but it doesn't appear to exist



Dear All,

 

We’ve just upgraded our pool to Condor-6.8.0 on RHEL 4, and I’ve noticed some strange messages in schedd.log, namely:

 

9/8 10:09:05 (pid:10265) Attempting to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', but it doesn't appear to exist.

9/8 10:09:05 (pid:10265) Error: Unable to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' from 502 to 501.501

9/8 10:09:05 (pid:10265) (3.0) Failed to chown /opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0 from 502 to 501.501.  User may run into permissions problems when fetching sandbox.

 

 

Does anyone have any thoughts on what these messages mean?  I’ve only noticed them in this version; they weren’t present in 6.6.11.

 

Thanks

 

 

The actual log is below.

 

 

[schedd.log on submit node]

 

9/8 10:07:15 (pid:10205) ******************************************************

9/8 10:07:15 (pid:10205) ** condor_schedd (CONDOR_SCHEDD) STARTING UP

9/8 10:07:15 (pid:10205) ** /opt/condor-6.8.0/sbin/condor_schedd

9/8 10:07:15 (pid:10205) ** $CondorVersion: 6.8.0 Jul 19 2006 $

9/8 10:07:15 (pid:10205) ** $CondorPlatform: I386-LINUX_RHEL3 $

9/8 10:07:15 (pid:10205) ** PID = 10205

9/8 10:07:15 (pid:10205) ** Log last touched 9/8 10:04:27

9/8 10:07:15 (pid:10205) ******************************************************

9/8 10:07:15 (pid:10205) Using config source: /mnt/condor_nfs/dell_optiplex_gx150_config/condor_config

9/8 10:07:15 (pid:10205) Using local config sources:

9/8 10:07:15 (pid:10205)    /opt/condor-6.8.0/local.west153/condor_config.local

9/8 10:07:15 (pid:10205) DaemonCore: Command Socket at <xxx.xxx.xxx.xxx:56508>

9/8 10:07:15 (pid:10205) History file rotation is enabled.

9/8 10:07:15 (pid:10205)   Maximum history file size is: 20971520 bytes

9/8 10:07:15 (pid:10205)   Number of rotated history files is: 2

9/8 10:07:17 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx

9/8 10:07:17 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx

9/8 10:09:05 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:54356>

9/8 10:09:05 (pid:10205) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)

9/8 10:09:05 (pid:10265) Attempting to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0', but it doesn't appear to exist.

9/8 10:09:05 (pid:10265) Error: Unable to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' from 502 to 501.501

9/8 10:09:05 (pid:10265) (3.0) Failed to chown /opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0 from 502 to 501.501.  User may run into permissions problems when fetching sandbox.

9/8 10:09:11 (pid:10205) DaemonCore: Command received via UDP from host <xxx.xxx.xxx.xxx:33914>

9/8 10:09:11 (pid:10205) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)

9/8 10:09:11 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx

9/8 10:09:11 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx

9/8 10:09:11 (pid:10205) Called reschedule_negotiator()

9/8 10:09:11 (pid:10205) failed to send RESCHEDULE command to negotiator

9/8 10:13:07 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:38060>

9/8 10:13:07 (pid:10205) DaemonCore: received command 416 (NEGOTIATE), calling handler (doNegotiate)

9/8 10:13:07 (pid:10205) Negotiating for owner: sjo@xxxxxxxxxx

9/8 10:13:07 (pid:10205) Checking consistency running and runnable jobs

9/8 10:13:07 (pid:10205) Tables are consistent

9/8 10:13:07 (pid:10205) Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0

9/8 10:13:07 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx

9/8 10:13:07 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx

9/8 10:13:09 (pid:10205) Starting add_shadow_birthdate(4.0)

9/8 10:13:09 (pid:10205) Started shadow for job 4.0 on "<xxx.xxx.xxx.xxx:55886>", (shadow pid = 10304)

9/8 10:13:09 (pid:10205) Shadow pid 10304 for job 4.0 exited with status 100

9/8 10:13:10 (pid:10205) match (<xxx.xxx.xxx.xxx:55886>#1157706435#1) out of jobs (cluster id 4); relinquishing

9/8 10:13:10 (pid:10205) Sent RELEASE_CLAIM to startd on <xxx.xxx.xxx.xxx:55886>

9/8 10:13:10 (pid:10205) Match record (<xxx.xxx.xxx.xxx:55886>, 4, -1) deleted

9/8 10:13:10 (pid:10205) DaemonCore: Command received via TCP from host <xxx.xxx.xxx.xxx:38513>

9/8 10:13:10 (pid:10205) DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)

9/8 10:13:10 (pid:10205) Got VACATE_SERVICE from <xxx.xxx.xxx.xxx:38513>

9/8 10:13:12 (pid:10205) Sent owner (0 jobs) ad to 1 collectors

9/8 10:15:39 (pid:10205) DaemonCore: Command received via UDP from host <xxx.xxx.xxx.xxx:33932>

9/8 10:15:39 (pid:10205) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)

9/8 10:15:39 (pid:10205) Sent ad to central manager for sjo@xxxxxxxxxx

9/8 10:15:39 (pid:10205) Sent ad to 1 collectors for sjo@xxxxxxxxxx

9/8 10:15:39 (pid:10205) Called reschedule_negotiator()

9/8 10:15:39 (pid:10205) failed to send RESCHEDULE command to negotiator
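
For anyone who wants to check their own schedd.log for the same messages, here is a minimal Python sketch that pulls out the "Unable to chown" lines. It is not part of Condor. The regular expression is only an assumption based on the line format shown in the log above, and the names `CHOWN_ERROR` and `find_chown_errors` are made up for illustration.

```python
import re

# Assumed line format, taken from the log excerpt above:
#   <date> <time> (pid:N) Error: Unable to chown '<path>' from <uid> to <uid>.<gid>
CHOWN_ERROR = re.compile(
    r"^(?P<date>\d+/\d+) (?P<time>\d+:\d+:\d+) \(pid:(?P<pid>\d+)\) "
    r"Error: Unable to chown '(?P<path>[^']+)' "
    r"from (?P<old_uid>\d+) to (?P<new_uid>\d+)\.(?P<new_gid>\d+)"
)


def find_chown_errors(lines):
    """Return one dict of captured fields per 'Unable to chown' line."""
    found = []
    for line in lines:
        match = CHOWN_ERROR.match(line.strip())
        if match:
            found.append(match.groupdict())
    return found


# Three lines copied from the log above; only the second should match.
sample = [
    "9/8 10:09:05 (pid:10205) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)",
    "9/8 10:09:05 (pid:10265) Error: Unable to chown '/opt/condor-6.8.0/local.execnode1/spool/cluster3.proc0.subproc0' from 502 to 501.501",
    "9/8 10:09:11 (pid:10205) Called reschedule_negotiator()",
]

errors = find_chown_errors(sample)
for err in errors:
    print(err["time"], err["path"], err["old_uid"], "->", err["new_uid"])
```

To run it against a real log, pass an open file instead of `sample`, for example `find_chown_errors(open("schedd.log"))`, with the path adjusted to wherever your LOG directory lives.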