
Re: [Condor-users] "condor_q -format ... Requirements" only prints single queue element



On Jul 31, 2012, Jaime Frey wrote:
>On Jul 25, 2012, at 12:08 PM, Martin Koniczek wrote:
>> We just upgraded from condor 7.2.4 (as shipped with ubuntu10.04) to
>> condor-7.8.1-43996-deb_6.0_amd64.deb (ubuntu12.04)
>> and now the previously working command:
>>
>> condor_q -format "%s\n" Requirements
>>
>> only returns exactly one "record", regardless of the number of elements
>> in the queue. Asking for other attributes works as expected. Can anybody
>> confirm this behavior?
>
>I just tried this exact setup (condor-7.8.1-43996-deb_6.0_amd64.deb
>installed on Ubuntu 12.04) and it works as expected (one line for each
>job in the queue). I don't know what's going wrong on your machine.

Our machine in question is running 12.04 Server 64-bit, but I don't think
that is the problem.
To make sure, I installed condor 7.8.1 on a vanilla Ubuntu 12.04 64-bit
Desktop system, did not change any configuration file, and tried the simple
two-job example described at the end of this email. The outcome was the
same incorrect output as on our production 12.04 server.
This makes me wonder why you were not able to reproduce the issue. Did
your 12.04 system have any libraries or packages installed which are not
listed as dependencies of condor-7.8.1-43996-deb_6.0_amd64.deb? What
else could be different?


>Check the exit code of condor_q to see if it's terminating early due to
>some error.
>You can also look at the log of the schedd for any strange messages.
>Also, try removing the '%s' corresponding to Requirements and see if
>you get one line or all the expected lines.

The exit code is always zero, and replacing %s with x prints one x for each
queue element. But tailing SchedLog reveals a little more. Here is the
actual output:

$ condor_q -format "%s\n" Requirements && echo "ok"
( TARGET.FileSystemDomain == "hellboyraids" && TARGET.UidDomain ==
"cirsims" && TARGET.Machine == "sky036" ) && ( TARGET.Arch == "X86_64" )
&& ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) && (
TARGET.Memory >= RequestMemory )
ok

While in SchedLog:
08/01/12 11:30:00 (pid:30667) Number of Active Workers 1
08/01/12 11:30:00 (pid:18808) Number of Active Workers 0
08/01/12 11:30:00 (pid:18808) can't parse constraint: "Req
08/01/12 11:30:00 (pid:18808) can't parse constraint: "Req
..... (message repeated for a total of exactly 121 occurrences)


$ condor_q -format "x" Requirements && echo "ok"
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxok

While in SchedLog:
08/01/12 11:31:13 (pid:30667) Number of Active Workers 1
08/01/12 11:31:13 (pid:19874) Number of Active Workers 0
.... (note that there are exactly 122 "x" above)

So with "%s" I get one correct line of output and 121 "can't parse" log
entries, and without "%s" I get 122 "x" (the queue had 122 elements).
It looks like some counter or string in the parser is not reset properly,
or something similar, but only if the "Requirements" contain some magic
sauce?
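The number of "can't parse constraint" entries can be counted mechanically
rather than by eye. A minimal sketch, assuming the schedd log path is passed
in by the caller; the function name count_parse_errors is hypothetical and
not part of Condor:

```shell
# Count "can't parse constraint" entries in a schedd log file.
# count_parse_errors is a hypothetical helper name; $1 is the log file path.
count_parse_errors() {
    # -c prints the number of matching lines, -F matches the text literally.
    grep -cF "can't parse constraint" "$1"
}
```

On a default install the path could probably be obtained with
`condor_config_val SCHEDD_LOG`, but that depends on the local configuration.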

To track down the magic sauce, I cleared the queue, and submitted two
very simple jobs without extra requirements:
Executable = /bin/sleep
arguments = $(Process) 365d
Universe = vanilla
Log = condor.log
Queue 2

But the behavior is still the same:
$ condor_q -format "%s\n" Requirements && echo "exit code 0"
( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && (
TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && (
( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain ==
MY.FileSystemDomain ) )
exit code 0

$ condor_q -format "x\n" Requirements && echo "exit code 0"
x
x
exit code 0
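
Since the exit code is zero in both cases, a script cannot detect the problem
from the exit status alone. A rough sketch of a check that compares the
record counts of the two commands above; it uses only the
`condor_q -format` invocations already shown, and the function name and
message wording are made up for illustration:

```shell
# Compare how many lines condor_q prints for "%s\n" versus a constant "x\n".
# check_format_counts is a hypothetical helper; $1 is the attribute name.
check_format_counts() {
    attr="${1:-Requirements}"
    # Lines printed when the attribute value itself is formatted.
    with_value=$(condor_q -format '%s\n' "$attr" | wc -l | tr -d '[:space:]')
    # Lines printed when only a constant is emitted per job.
    with_const=$(condor_q -format 'x\n' "$attr" | wc -l | tr -d '[:space:]')
    if [ "$with_value" -eq "$with_const" ]; then
        printf 'ok: %s records for %s\n' "$with_value" "$attr"
    else
        printf 'mismatch: value format gave %s lines, constant format gave %s\n' \
            "$with_value" "$with_const"
        return 1
    fi
}
```

Calling `check_format_counts Requirements` on the two-job queue above would
be expected to report a mismatch (1 versus 2) on an affected system.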

I hope we can track down the problem. We have some scripts which heavily
rely on the correct behavior of this command, and I have the feeling
there could be more to this problem than just a mangled output.

Regards,
	Martin