
[Condor-users] Problem Condor Job Stays Idle Because of target.CkptArch



Hello all,
I am submitting a job through Globus to Condor, but the job stays in the idle state. The job details are as follows.
================================================
The job description generated by GRAM is as follows:

[condor@niting-w2p etc]$ cat /tmp/condor_job_description
#
# description file for condor submission
#
Universe = standard
Notification = Never
Executable = /home/psegrid/NIP/nip
Requirements = OpSys == "LINUX"  && Arch == "INTEL"
Environment = GLOBUS_LOCATION=/usr/local/globus-4.0.5/;X509_CERT_DIR=/etc/grid-security/certificates;X509_USER_PROXY=;X509_USER_CERT=;X509_USER_KEY=;HOME=/home/psegrid;LOGNAME=psegrid;SCRATCH_DIRECTORY=/home/psegrid/.globus/scratch;JAVA_HOME=/usr/java/jdk1.6.0_03/jre;GLOBUS_GRAM_JOB_HANDLE= https://192.168.7.221:8443/wsrf/services/ManagedExecutableJobService?7f408200-9789-11dc-9f1a-b41f06e1e2ea;LD_LIBRARY_PATH=
Arguments =
InitialDir = /home/psegrid
Input = /dev/null
Log = /usr/local/globus-4.0.5//var/globus-condor.log
log_xml = True
#Extra attributes specified by client

Output = /home/psegrid/stdout
Error = /home/psegrid/stderr
queue 1
 
=======================================================================
[psegrid@niting-w2p NIP]$ condor_q -better-analyze


-- Submitter: niting-w2p.corp.cdac.in : <192.168.7.221:42993> : niting-w2p.corp.cdac.in
---
005.000:  Run analysis summary.  Of 7 machines,
     4 are rejected by your job's requirements
     0 reject your job because of their own requirements
     0 match but are serving users with a better priority in the pool
     3 match but reject the job for unknown reasons
     0 match but will not currently preempt their existing job
     0 are available to run your job
       Last successful match: Tue Nov 20 22:36:21 2007

The Requirements expression for your job is:

( target.OpSys == "LINUX" && target.Arch == "INTEL" ) &&
( ( target.CkptArch == target.Arch ) || ( target.CkptArch is undefined ) ) &&
( ( target.CkptOpSys == target.OpSys ) || ( target.CkptOpSys is undefined ) ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize )

   Condition                         Machines Matched    Suggestion
   ---------                         ----------------    ----------
1   target.Arch == "INTEL"            3                    
2   target.OpSys == "LINUX"           7                    
3   ( ( target.CkptArch == target.Arch ) || ( target.CkptArch is undefined ) )
                                     7                    
4   ( ( target.CkptOpSys == target.OpSys ) || ( target.CkptOpSys is undefined ) )
                                     7                    
5   ( target.Disk >= 20000 )          7                    
6   ( ( 1024 * target.Memory ) >= 20000 )7        




==========================================================
[psegrid@niting-w2p NIP]$ condor_status

Name          OpSys       Arch   State      Activity   LoadAv Mem   ActvtyTime

vm1@niting-w2 LINUX       INTEL  Unclaimed  Idle       0.000   469  0+00:05:26
vm2@niting-w2 LINUX       INTEL  Unclaimed  Idle       0.140   469  0+00:26:42
sskadam-w2p.c LINUX       INTEL  Unclaimed  Idle       0.000   248  0+00:44:38
vm1@psewebs-w LINUX       X86_64 Unclaimed  Idle       0.400   753  0+00:30:04
vm2@psewebs-w LINUX       X86_64 Unclaimed  Idle       0.000   753  0+00:30:05
vm3@psewebs-w LINUX       X86_64 Unclaimed  Idle       0.000   753  0+00:30:06
vm4@psewebs-w LINUX       X86_64 Unclaimed  Idle       0.000   753  0+00:30:27

                    Total Owner Claimed Unclaimed Matched Preempting Backfill

        INTEL/LINUX     3     0       0         3       0          0        0
       X86_64/LINUX     4     0       0         4       0          0        0

              Total     7     0       0         7       0          0        0
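
As a cross-check of the analysis against this machine list, here is a minimal Python sketch (not Condor code, and not how the negotiator actually evaluates ClassAds). It applies only the conditions whose attributes are visible in the condor_status output: OpSys, Arch and Memory. The Disk and Ckpt* conditions are skipped because those attributes are not shown here. The machine names are the truncated names exactly as printed, and the 20000 figure is taken from condition 6 of the analysis.

```python
# Machine ads transcribed from the condor_status output above:
# (name as printed, OpSys, Arch, Memory in MB)
MACHINES = [
    ("vm1@niting-w2", "LINUX", "INTEL", 469),
    ("vm2@niting-w2", "LINUX", "INTEL", 469),
    ("sskadam-w2p.c", "LINUX", "INTEL", 248),
    ("vm1@psewebs-w", "LINUX", "X86_64", 753),
    ("vm2@psewebs-w", "LINUX", "X86_64", 753),
    ("vm3@psewebs-w", "LINUX", "X86_64", 753),
    ("vm4@psewebs-w", "LINUX", "X86_64", 753),
]

# Right-hand side of condition 6 in the condor_q analysis.
IMAGE_SIZE = 20000


def matches(opsys, arch, memory_mb):
    """Apply the visible part of the job's Requirements to one machine."""
    return (
        opsys == "LINUX"
        and arch == "INTEL"
        and memory_mb * 1024 >= IMAGE_SIZE
    )


matched = [name for name, opsys, arch, mem in MACHINES if matches(opsys, arch, mem)]
rejected = [name for name, opsys, arch, mem in MACHINES if not matches(opsys, arch, mem)]

print(len(matched), "machines pass:", matched)
print(len(rejected), "machines fail:", rejected)
```

This agrees with the summary: the four X86_64 slots are rejected by the Arch clause, and the three INTEL slots pass the visible clauses, yet condor_q still reports them as "match but reject the job for unknown reasons".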
==============================================================
The daemon-related configuration for all three machines is as follows:

[condor@niting-w2p etc]$ ./test.sh
current file: condor_config
##  checkpoint server isn't available or USE_CKPT_SERVER is set to
USE_CKPT_SERVER = True
CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in
##  checkpoint server?  If False, the CKPT_SERVER_HOST set on
##  the submit machine is used.  Otherwise, the CKPT_SERVER_HOST set
STARTER_CHOOSES_CKPT_SERVER = True
#WALL_CLOCK_CKPT_INTERVAL = 3600
##  setting is only used if USE_CKPT_SERVER (from above) is True.
#COMPRESS_PERIODIC_CKPT = False
#COMPRESS_VACATE_CKPT = False
#SLOW_CKPT_SPEED = 0
DAEMON_LIST                     = MASTER, STARTD, SCHEDD
#DC_DAEMON_LIST = \
=============
current file: psewebs-w2p.local
USE_CKPT_SERVER = True
CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in
DAEMON_LIST = MASTER, STARTD, SCHEDD
DAEMON_LIST   = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
=============
current file: niting-w2p.local
USE_CKPT_SERVER = True
CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in
DAEMON_LIST = MASTER, STARTD, SCHEDD
=============
current file: sskadam-w2p.local
USE_CKPT_SERVER = True
CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in
DAEMON_LIST = MASTER, STARTD, SCHEDD
===============================
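
To summarise the settings above, here is a small Python sketch that works out the effective DAEMON_LIST per file, assuming that a later assignment in the same file overrides an earlier one (which matters for psewebs-w2p.local, where DAEMON_LIST appears twice). The dictionary is just the uncommented lines from the ./test.sh output, typed in by hand. The helper names are made up for this example. It also reports whether any list contains CKPT_SERVER, although I have not confirmed that this is related to the idle job.

```python
# Uncommented settings copied from the ./test.sh output above, keyed by file.
GREP_OUTPUT = {
    "condor_config": [
        "USE_CKPT_SERVER = True",
        "CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in",
        "STARTER_CHOOSES_CKPT_SERVER = True",
        "DAEMON_LIST                     = MASTER, STARTD, SCHEDD",
    ],
    "psewebs-w2p.local": [
        "USE_CKPT_SERVER = True",
        "CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in",
        "DAEMON_LIST = MASTER, STARTD, SCHEDD",
        "DAEMON_LIST   = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD",
    ],
    "niting-w2p.local": [
        "USE_CKPT_SERVER = True",
        "CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in",
        "DAEMON_LIST = MASTER, STARTD, SCHEDD",
    ],
    "sskadam-w2p.local": [
        "USE_CKPT_SERVER = True",
        "CKPT_SERVER_HOST        = psewebs-w2p.corp.cdac.in",
        "DAEMON_LIST = MASTER, STARTD, SCHEDD",
    ],
}


def effective_settings(lines):
    """Return {name: value}; a later assignment replaces an earlier one."""
    settings = {}
    for line in lines:
        if line.lstrip().startswith("#") or "=" not in line:
            continue  # skip comments and anything that is not an assignment
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def daemon_list(settings):
    """Split the DAEMON_LIST value into individual daemon names."""
    raw = settings.get("DAEMON_LIST", "")
    return [d.strip() for d in raw.split(",") if d.strip()]


summary = {
    fname: daemon_list(effective_settings(lines))
    for fname, lines in GREP_OUTPUT.items()
}

for fname, daemons in summary.items():
    print(fname, daemons, "CKPT_SERVER listed:", "CKPT_SERVER" in daemons)
```

With the settings as pasted, every file points CKPT_SERVER_HOST at psewebs-w2p.corp.cdac.in, and none of the effective DAEMON_LIST values includes CKPT_SERVER.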

Please tell me what is wrong with the job submission.
Thank you.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nitin M. Gavhane
MS in Advanced Software Technologies
International Institute of Information Technology
P-14,Hinjewadi,Pune, India.