
[Condor-users] condor job never running



Hi all,
    I've been trying to stand up Condor servers on Tru64 machines, to no
avail. I've found similar issues mentioned elsewhere on the lists, but
no resolutions. This behavior is seen with Condor-6.7.18 and GT4: when a
job is submitted it is queued but never runs.

The CONDOR_HOST is set to the hostname of the 128.182.112.71 interface,
and the 128.182.112.71 interface is selected by condor via the
NETWORK_INTERFACE directive.
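
As a rough sketch only, the two settings described above would look
something like this in condor_config. The hostname is a placeholder (an
assumption for illustration, not the real value); the IP address is the
one given above.

# Placeholder hostname: substitute the name that resolves to 128.182.112.71
CONDOR_HOST = your-host.example.org
# Bind the Condor daemons to this interface
NETWORK_INTERFACE = 128.182.112.71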

Any ideas why jobs are not being delivered to the globus job-manager?
(globusrun and globus-job-run work).

HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST)
HOSTALLOW_READ = *
HOSTALLOW_WRITE = $(CONDOR_HOST), $(GLIDEIN_SITES) , $(FULL_CLUSTER)

All HOSTDENY directives are commented out.

Output from condor_q...


-- Submitter: iam763 : <128.182.112.71:9288> : iam763
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  12.0   shelmire        5/10 14:39   0+00:00:00 I  0   0.0  hostname

1 jobs; 1 idle, 0 running, 0 held

Below is the SchedLog information...

5/10 14:27:23 DaemonCore: Command received via UDP from host <128.182.112.71:56455>
5/10 14:27:23 DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
5/10 14:27:23 Got SIGTERM. Performing graceful shutdown.
5/10 14:27:23 Deleting CronMgr
5/10 14:27:23 Cleaning job queue...
5/10 14:27:23 All shadows are gone, exiting.
5/10 14:27:23 **** condor_schedd (condor_SCHEDD) EXITING WITH STATUS 0
5/10 14:27:34 ******************************************************
5/10 14:27:34 ** condor_schedd (CONDOR_SCHEDD) STARTING UP
5/10 14:27:34 ** $CondorVersion: 6.7.18 Mar 22 2006 $
5/10 14:27:34 ** $CondorPlatform: ALPHA-DUX5 $
5/10 14:27:34 ** PID = 674279
5/10 14:27:34 ******************************************************
5/10 14:27:34 Using config file: /usr/local/packages/tg/condor-g-6.7.18-r1/etc/condor_config
5/10 14:27:34 DaemonCore: Command Socket at <128.182.112.71:9288>
5/10 14:27:34 ERROR: Unable to find collector info in configuration file!!!
5/10 14:27:34 History file rotation is enabled.
5/10 14:27:34   Maximum history file size is: 20971520 bytes
5/10 14:27:34   Number of rotated history files is: 2
5/10 14:38:39 IO: Failed to read packet header
5/10 14:39:09 IO: Failed to read packet header
5/10 14:39:09 DaemonCore: Command received via UDP from host <128.182.112.71:56457>
5/10 14:39:09 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
5/10 14:39:09 Sent ad to central manager for shelmire@xxxxxxxxxxxxxxxx
5/10 14:39:09 Sent ad to 0 collectors for shelmire@xxxxxxxxxxxxxxxx
5/10 14:39:09 ERROR - gridmanager way too old!
5/10 14:39:09 Called reschedule_negotiator()
5/10 14:39:09 ERROR: Unable to find collector info in configuration file!!!
5/10 14:39:09 failed to send RESCHEDULE command to negotiator
5/10 14:39:39 IO: Failed to read packet header