
Re: [Condor-users] STARTD_AD_REEVAL_EXPR message in NegotiatorLog




I recently discovered this bug too. A fix will be released with Condor 7.0.5 and 7.1.3.

The consequence of the bug is that, in some cases, the CurMatches attribute is reset to 0 before a new update to the site ClassAd is published.

--Dan

Warren Smith wrote:

Hi, I'm doing some matchmaking with Condor-G (Condor 7.1.0), and I get errors in the NegotiatorLog such as:

9/2 10:34:01 ---------- Started Negotiation Cycle ----------
9/2 10:34:01 Phase 1:  Obtaining ads from collector ...
9/2 10:34:01   Getting all public ads ...
9/2 10:34:01   Sorting 38 ads ...
9/2 10:34:01 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
...
9/2 10:34:02 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
9/2 10:34:02   Getting startd private ads ...
9/2 10:34:02 Got ads: 38 public and 0 private
9/2 10:34:02 Public ads include 1 submitter, 33 startd

This error doesn't seem to be affecting anything (classads are getting updated), but I thought I'd double check since my web searching didn't really turn up anything.
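
For reference, the log message describes a fallback: when the expression target.UpdateSequenceNumber > my.UpdateSequenceNumber cannot be evaluated as a boolean, the result is treated as TRUE. The Python sketch below only mimics that stated behavior; it is not the negotiator's code. The function name, the use of plain dicts for ads, and the choice of which ad plays "my" and which plays "target" are all assumptions made for illustration.

```python
from typing import Any, Mapping


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def reeval_needed(my: Mapping[str, Any], target: Mapping[str, Any]) -> bool:
    """Mimic `target.UpdateSequenceNumber > my.UpdateSequenceNumber`.

    Hypothetical helper: ads are modelled as plain dicts.
    """
    old_seq = my.get("UpdateSequenceNumber")
    new_seq = target.get("UpdateSequenceNumber")
    if not (_is_number(old_seq) and _is_number(new_seq)):
        # The comparison has no boolean result (for example, an attribute
        # is missing), so follow the log message and treat it as TRUE.
        return True
    return new_seq > old_seq
```

With this sketch, an ad that lacks UpdateSequenceNumber on either side always takes the "treat as TRUE" branch, which is the situation the log lines report.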

I get the message for what looks like each of the ClassAds I inserted into Condor with condor_advertise. Here is an example ClassAd:

lslogin2$ condor_status -l tacc.lonestar.serial
MyType = "Machine"
TargetType = "Job"
Requirements = (TARGET.JobUniverse == 9)
Rank = 0.000000
CurrentRank = 0.000000
WantAdRevaluate = TRUE
CurMatches = 0
Name = "tacc.lonestar.serial"
Machine = "gatekeeper.lonestar.tacc.teragrid.org"
StartdIpAddr = "<129.114.50.32>"
GridResource = "gt2 gatekeeper.lonestar.tacc.teragrid.org:2119/jobmanager-lsf"
State = "Unclaimed"
Activity = "Idle"
UpdateSequenceNumber = 1220367368
Arch = "X86_64"
OpSys = "LINUX"
LoadAvg = 0.865580
TotalMemory = 11840721
Memory = 1725537
Queue = "serial"
Priority = 0.030000
MaxWallTime = 720
MaxProcessors = 1
MyAddress = "<192.5.198.172:0>"
LastHeardFrom = 1220367369
UpdatesTotal = 1328
UpdatesSequenced = 0
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"


I'm setting UpdateSequenceNumber from Unix time(). I also tried temporarily using just the last 5 digits of the current time, and I got the same error.
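
A minimal sketch of how such an ad might be generated with a time-based UpdateSequenceNumber is shown below. The helper names (format_value, build_ad) are hypothetical and this is not the script actually used here; it handles only string, number, and boolean attributes, so expression-valued attributes such as Requirements are left out.

```python
import time
from typing import Mapping, Optional, Union

Value = Union[str, int, float, bool]


def format_value(value: Value) -> str:
    """Render a Python value in the style of the ad shown above."""
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "TRUE" if value else "FALSE"
    if isinstance(value, str):
        if '"' in value:
            # Quoting rules are out of scope for this sketch.
            raise ValueError("embedded double quotes are not handled")
        return f'"{value}"'
    return str(value)


def build_ad(attrs: Mapping[str, Value], now: Optional[float] = None) -> str:
    """Return ad text with UpdateSequenceNumber set from Unix time."""
    seq = int(time.time() if now is None else now)
    lines = [f"{name} = {format_value(value)}" for name, value in attrs.items()]
    lines.append(f"UpdateSequenceNumber = {seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_ad({
        "MyType": "Machine",
        "TargetType": "Job",
        "Name": "tacc.lonestar.serial",
        "Queue": "serial",
        "MaxWallTime": 720,
        "WantAdRevaluate": True,
    }), end="")
```

The resulting text could be written to a file and handed to condor_advertise (typically with the UPDATE_STARTD_AD command for Machine ads); the exact invocation used in this setup is not shown above, so treat that as an assumption.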

Thanks for the help,


Warren
