[Condor-users] Fix for the possibly bogus "ceiling(ifThenElse(JobVMMemory ..." requirement issue



Have recently looked at deploying Condor 7.6.3 to some win7
boxes.

Whilst the Condor compute nodes seem to start up fine, removing
some problems that had been seen before on the win7 platform,
even the simplest of jobs (pass a .BAT file over and run it)
seems to fail to find a host because, so condor_q -better-analyze
suggests, of this:

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,JobVMMemory,9.765625000000000E-04)) ) >= 1 )
                                      0                   REMOVE

Looking around the interweb thing for

condor ceiling ifThenElse JobVMMemory

suggests there does not seem to be a clear answer out there as yet.

I have seen suggestions that it could be windows firewall related
(for a memory calculation?), but they make no mention as to what
might not be getting firewalled.

I saw a RedHat Cumin advisory that mentioned the issue but ends
up linking to an erratum that seems to deal with a "log broker
authentication" attack vector, so, again, nothing to do with memory.

Furthermore, the links I have followed all seem to suggest that
no-one actually knows how that condition comes into the requirements
equation in the first place.

The odd thing for me is this: a not-quite working Condor 7.4.4
on a win 7 host did allow me to run the same test and, as I
had copied back the job ad as part of the test, I can see that
this requirement was in there:

RequestMemory = ceiling(ifThenElse(JobVMMemory =!= UNDEFINED, JobVMMemory,
ImageSize / 1024.000000))
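
As a sanity check on the arithmetic only, here is a minimal Python sketch
of the two expressions above. It is not Condor's ClassAd evaluator: the
helper names are made up for illustration, UNDEFINED is modelled as None,
and ImageSize = 1 is an assumption inferred from the constant
9.765625000000000E-04, which is exactly 1/1024.

```python
import math

UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value (assumption)


def request_memory(job_vm_memory, image_size):
    """Mimic ceiling(ifThenElse(JobVMMemory =!= UNDEFINED,
    JobVMMemory, ImageSize / 1024.0)) in plain Python."""
    if job_vm_memory is not UNDEFINED:
        value = job_vm_memory
    else:
        value = image_size / 1024.0
    return math.ceil(value)


def flagged_condition(job_vm_memory, fallback):
    """Mimic ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,
    JobVMMemory, fallback)) ) >= 1 as printed by the analyzer."""
    if job_vm_memory is not UNDEFINED:
        value = job_vm_memory
    else:
        value = fallback
    return 1024 * math.ceil(value) >= 1


# The constant in the analyzer output is exactly 1/1024, which is what
# ImageSize / 1024.0 gives when ImageSize is 1 (assumed value).
fallback = 9.765625000000000E-04
assert fallback == 1 / 1024

print(request_memory(UNDEFINED, 1))            # RequestMemory with ImageSize = 1
print(flagged_condition(UNDEFINED, fallback))  # the condition flagged REMOVE
```

Taken purely as arithmetic, with JobVMMemory undefined the ceiling is 1
and 1024 * 1 >= 1 holds, so this sketch on its own does not explain why
the analyzer reports zero machines matching that condition.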

Now I come to run the same submission to a 7.6.3, win 7 host and
I seemingly can't meet the requirements?

The job submission file looks like this

-----8<-------------8<-------------8<-------------8<-------------8<--------
##########

universe = vanilla
environment = path=c:\WINDOWS\SYSTEM32
executable = pokearnd-win7.bat
TransferInputFiles  =
arguments  =
output     = pokearnd.out.$(Cluster).$(Process)
error      = pokearnd.err.$(Cluster).$(Process)
log        = pokearnd.log.$(Cluster).$(Process)
Requirements = (OpSys == "WINNT61") && (Machine == "somemachine.vuw.ac.nz" )
ShouldTransferFiles  = YES
WhenToTransferOutput = ON_EXIT
queue 1
-----8<-------------8<-------------8<-------------8<-------------8<--------

The master, UNIX, is running condor-7.4.4

One more piece of info.

Following another link I decided to read, I added a (Memory > 766) to the
Requirements in the above and, although the job is not assigned to the
targeted machine, condor_status shows three of the four slots on the
machine listed as "Matched":

slot1@somemachine  WINNT61    INTEL  Matched   Idle     0.000   767  0+00:00:04
slot2@somemachine  WINNT61    INTEL  Matched   Idle     0.000   767  0+00:00:05
slot3@somemachine  WINNT61    INTEL  Matched   Idle     0.000   767  0+00:00:06
slot4@somemachine  WINNT61    INTEL  Unclaimed Idle     0.000   767  0+00:00:07

Most interested to hear what folk who might know think is actually
going on here,
Kevin

-- 
Kevin M. Buckley                                  Room:  CO327
School of Engineering and                         Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand