
[Condor-users] AddConstraint: Condition value not literal



Hi all

I have some jobs that just won't start running, and I really don't understand why. There are two strange things in the "condor_q -better-analyze" output that I need explained. They might be the reason (hopefully...), but I can't decipher them.

This is the simple MPI example:

######################################
## Parallel example submit description file
## using a shared file system
######################################
universe = parallel
executable = /bin/cat
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 4
should_transfer_files = yes
when_to_transfer_output = on_exit
queue
########################

$ condor_q -better-analyze


-- Submitter: seurat.my.domain : <172.XX.XX.XX:32857> : seurat.my.domain
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
---
031.000:  Run analysis summary.  Of 51 machines,
     24 are rejected by your job's requirements
      9 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
     18 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

The Requirements expression for your job is:

( target.Arch == "INTEL" ) && ( target.OpSys == "LINUX" ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize ) &&
( target.HasFileTransfer )

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( target.Arch == "INTEL" )        27
2   ( target.OpSys == "LINUX" )       51
3   ( target.Disk >= 10000 )          51
4   ( ( 1024 * target.Memory ) >= 10000 )  51
5   ( target.HasFileTransfer )        51

The following attributes are missing from the job ClassAd:

CheckpointPlatform
##############################
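
As a sanity check on the summary above, the category counts can be added up to see whether they account for all 51 machines. This is only a minimal sketch in Python: the numbers are copied by hand from the output above, and the variable names are illustrative, not anything Condor provides.

```python
# Counts copied by hand from the "Run analysis summary" for job 031.000.
total_machines = 51
categories = {
    "rejected by the job's requirements": 24,
    "reject the job because of their own requirements": 9,
    "match but serve users with a better priority": 0,
    "match but reject the job for unknown reasons": 18,
    "match but will not preempt their existing job": 0,
    "available to run the job": 0,
}

# Do the categories cover every machine in the pool?
accounted = sum(categories.values())
unaccounted = total_machines - accounted
print(f"{accounted} of {total_machines} machines accounted for, {unaccounted} left over")

# Condition 1 ( target.Arch == "INTEL" ) matched 27 machines in the table.
arch_matched = 27
arch_failed = total_machines - arch_matched
print(f"{arch_failed} machines fail the Arch condition")
```

The categories sum to 51, so every machine falls into one of the three rejection groups. The 24 machines failing the Arch condition equal the 24 rejected by the job's requirements, which is consistent with (though not proof of) Arch alone explaining that first group.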


So here are the two things I don't understand:
- What does "AddConstraint: Condition value not literal" mean? It looks like a plausible reason for 18 resources to reject my job (but then why would the others accept it? Or have the others perhaps already rejected it for other reasons?)

- I don't understand "The following attributes are missing from the job ClassAd: CheckpointPlatform".
I'm not using any checkpointing for this job, and it is not "condor_compiled".


Could any of this be the reason my job won't start?

I can send any files you need (schedd logs, MasterLog with full debug, config files, etc.), just ask. For now I don't know where to look.

Thanks in advance
Nicolas





----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE

Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------