
Re: [Condor-users] "condor_q -format ... Requirements" only prints single queue element



On Aug 1, 2012, at 11:18 AM, Martin Koniczek wrote:

> On Jul 31, 2012, Jaime Frey wrote:
>> On Jul 25, 2012, at 12:08 PM, Martin Koniczek wrote:
>>> We just upgraded from Condor 7.2.4 (as shipped with Ubuntu 10.04)
>>> to condor-7.8.1-43996-deb_6.0_amd64.deb (Ubuntu 12.04)
>>> and now the previously working command:
>>> 
>>> condor_q -format "%s\n" Requirements
>>> 
>>> only returns exactly one "record", regardless of the number of elements
>>> in the queue. Asking for other attributes works as expected. Can anybody
>>> confirm this behavior?
>> 
>> I just tried this exact setup (condor-7.8.1-43996-deb_6.0_amd64.deb
>> installed on Ubuntu 12.04) and it works as expected (one line for each
>> job in the queue). I don't know what's going wrong on your machine.
> 
> Our machine in question is running 12.04 server 64bit, but I don't think
> that is the problem.
> To make sure, I just installed Condor 7.8.1 on a vanilla Ubuntu 12.04
> 64-bit Desktop system, did not change any configuration file, and tried
> the simple two-job example described at the end of this email, with the
> same malformed outcome as on our production 12.04 server.
> This makes me wonder why you were not able to reproduce the issue - did
> your 12.04 system have any libraries or packages installed which are not
> listed as requirements in the condor-7.8.1-43996-deb_6.0_amd64.deb? What
> else could be different?
> 
> 
>> Check the exit code of condor_q to see if it's terminating early due to
>> some error.
>> You can also look at the log of the schedd for any strange messages.
>> Also, try removing the '%s' corresponding to Requirements and see if
>> you get one line or all the expected lines.
> 
> The exit code is always zero, and replacing %s with x prints one x for
> each queue element. But tailing the SchedLog reveals a little more. Here
> is the actual output:
> 
> $ condor_q -format "%s\n" Requirements && echo "ok"
> ( TARGET.FileSystemDomain == "hellboyraids" && TARGET.UidDomain ==
> "cirsims" && TARGET.Machine == "sky036" ) && ( TARGET.Arch == "X86_64" )
> && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) && (
> TARGET.Memory >= RequestMemory )
> ok
> 
> While in SchedLog:
> 08/01/12 11:30:00 (pid:30667) Number of Active Workers 1
> 08/01/12 11:30:00 (pid:18808) Number of Active Workers 0
> 08/01/12 11:30:00 (pid:18808) can't parse constraint: "Req
> 08/01/12 11:30:00 (pid:18808) can't parse constraint: "Req
> ..... (message repeated for a total of exactly 121 occurrences)
> 
> 
> $ condor_q -format "x" Requirements && echo "ok"
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxok
> 
> While in SchedLog:
> 08/01/12 11:31:13 (pid:30667) Number of Active Workers 1
> 08/01/12 11:31:13 (pid:19874) Number of Active Workers 0
> .... (note that there are exactly 122 "x" above)
> 
> One correct line of output and 121 "can't parse" log entries with %s,
> and 122 "x" if not using "%s" (note that the queue had 122 elements).
> Perhaps some counter or string in the parser is not reset properly, or
> something similar, but only if the "Requirements" contain some magic
> sauce?
> 
> To track down the magic sauce, I cleared the queue, and submitted two
> very simple jobs without extra requirements:
> Executable = /bin/sleep
> arguments = $(Process) 365d
> Universe = vanilla
> Log = condor.log
> Queue 2
> 
> But the behavior is still the same:
> $ condor_q -format "%s\n" Requirements && echo "exit code 0"
> ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && (
> TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && (
> ( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain ==
> MY.FileSystemDomain ) )
> exit code 0
> 
> $ condor_q -format "x\n" Requirements && echo "exit code 0"
> x
> x
> exit code 0
> 
> I hope we can track down the problem. We have some scripts which heavily
> rely on the correct behavior of this command, and I have the feeling
> there could be more to this problem than just a mangled output.


Odd. Can you try adding this line to your config file:
   SCHEDD_DEBUG = D_SYSCALLS
Then run condor_reconfig and try another condor_q.

Then send me the resulting SchedLog. You can send it to me directly.

Thanks and regards,
Jaime Frey
UW-Madison Condor Team
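
The comparison made by hand in the quoted message (records printed by condor_q versus "can't parse constraint" entries in the SchedLog) could be scripted along the following lines. This is only a sketch and not part of the original exchange: the function names are hypothetical, it assumes the format string ends in "\n" so that each job yields exactly one output line, and it works on text you have already captured from condor_q and from the SchedLog rather than running any Condor tool itself.

```python
def count_records(condor_q_output: str) -> int:
    """Count non-empty output lines (one per job when the format ends in a newline)."""
    return sum(1 for line in condor_q_output.splitlines() if line.strip())


def count_parse_errors(schedlog_text: str) -> int:
    """Count SchedLog lines that report a constraint the schedd could not parse."""
    return sum(
        1 for line in schedlog_text.splitlines()
        if "can't parse constraint" in line
    )


def missing_records(expected_jobs: int, condor_q_output: str) -> int:
    """Return how many jobs produced no output line at all."""
    return expected_jobs - count_records(condor_q_output)


# Sample data copied from the message above (log excerpt shortened to four lines).
sample_log = (
    "08/01/12 11:30:00 (pid:30667) Number of Active Workers 1\n"
    "08/01/12 11:30:00 (pid:18808) Number of Active Workers 0\n"
    "08/01/12 11:30:00 (pid:18808) can't parse constraint: \"Req\n"
    "08/01/12 11:30:00 (pid:18808) can't parse constraint: \"Req\n"
)
sample_output = "x\nx\n"  # what 'condor_q -format "x\n" Requirements' printed for two jobs

print("records printed:", count_records(sample_output))
print("parse errors logged:", count_parse_errors(sample_log))
print("jobs without output:", missing_records(2, sample_output))
```

If the number of missing records matches the number of parse errors logged during the same condor_q run, as it did in the 122-job case (121 and 121), that supports the suspicion that every job after the first fails in the same way.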