
[HTCondor-users] Jobs stay idle



Dear all,

I recently migrated from SGE to Condor, and this is my first time submitting a job with Condor. Starting from the examples folder and following the online manual and the README tutorial, I can submit [program].cmd. However, all jobs stay idle and never start running. The following output might help:

condor_q output:

##
#-- Submitter: eggplant.msrc.local : <192.20.20.22:49720> : eggplant.msrc.local
# ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
#   5.0   arman           1/7  23:27   0+00:01:20 I  0   1.0  bash              
#   8.0   arman           1/8  11:09   0+00:00:00 I  0   0.0  sh_loop 60        

Here is the output for condor_q -analyze job_number:


# -- Submitter: eggplant.msrc.local : <192.20.20.22:49720> : eggplant.msrc.local
#       Last successful match: Wed Jan  8 11:17:46 2014

# The following attributes are missing from the job ClassAd:

#CheckpointPlatform

From the log files, I think the following might be useful:

ShadowLog:

#01/08/14 11:18:47 Using config source: /home/arman/condor_configs/condor_config
#01/08/14 11:18:47 Using local config sources:
#01/08/14 11:18:47    /home/arman/condor_configs/hosts/eggplant/condor_config.local
#01/08/14 11:18:47    /opt/HTCondor/local.eggplant/condor_config.local
#01/08/14 11:18:47    /opt/HTCondor/local.eggplant/condor_config.local
#01/08/14 11:18:47 DaemonCore: command socket at <192.20.20.22:33411?noUDP>
#01/08/14 11:18:47 DaemonCore: private command socket at <192.20.20.22:33411>
#01/08/14 11:18:47 ERROR "According to /opt/HTCondor/local.eggplant/spool/spool_version, the SPOOL directory is written in spool version 0, but I only support versions back to 1.
#" at line 67 in file /slots/03/dir_1684/userdir/src/condor_utils/spool_version.cpp

From the manager node, this is the MatchLog:

#01/08/14 11:27:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:47444> slot1@xxxxxxxxxxxxxxxxxxxxx
#01/08/14 11:27:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:47444> slot2@xxxxxxxxxxxxxxxxxxxxx
#01/08/14 11:28:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:35591> slot1@xxxxxxxxxxxxxxxxxxxxx
#01/08/14 11:28:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:35591> slot2@xxxxxxxxxxxxxxxxxxxxx
#01/08/14 11:29:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:35591> slot1@xxxxxxxxxxxxxxxxxxxxx
#01/08/14 11:29:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none <192.20.20.9:35591> slot2@xxxxxxxxxxxxxxxxxxxxx
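
The same two jobs appear to be matched again every minute. As a rough way to tally that from the match log, here is a minimal Python sketch. It is not an HTCondor tool: the function name count_matches and the regular expression are my own assumptions about the line layout shown above, and the sample text is just the "Matched" lines from this message, shortened.

```python
import re
from collections import Counter

# Sample lines copied from the match log in this message (tails shortened).
MATCH_LOG = """\
#01/08/14 11:27:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
#01/08/14 11:27:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
#01/08/14 11:28:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
#01/08/14 11:28:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
#01/08/14 11:29:04       Matched 5.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
#01/08/14 11:29:04       Matched 8.0 arman@xxxxxxxxxx <192.20.20.22:49720> preempting none
"""

# Assumed line shape: "<date> <time>   Matched <cluster.proc> <user> ..."
MATCH_RE = re.compile(r"^(\S+ \S+)\s+Matched\s+(\d+\.\d+)\s")


def count_matches(log_text):
    """Return a Counter mapping job ID (cluster.proc) to number of matches."""
    counts = Counter()
    for line in log_text.splitlines():
        # Drop the leading '#' that I added when pasting the log.
        m = MATCH_RE.match(line.lstrip("# "))
        if m:
            counts[m.group(2)] += 1
    return counts


if __name__ == "__main__":
    for job_id, n in sorted(count_matches(MATCH_LOG).items()):
        print(f"job {job_id}: matched {n} times")
```

On the lines above this reports three matches each for jobs 5.0 and 8.0, yet both jobs remain idle in condor_q.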

I would very much appreciate any input here.

All the best,
Arman