
Re: [Condor-users] [Birdbath Related] Strange behaviour - 6.7.17

That's confusing for two reasons:

a) I always install Condor with the instruction to "always run condor jobs", not only when the keyboard is idle, etc. By this I understand that my computer should always be available to Condor, no matter what.
b) I never ask for any particular requirements and send null instead. By this I understand that, whatever the case and however meagre the resources may be, the computers should just execute the jobs.

Both of the above, if true, imply that a match should always be made. Since that has not been the case, either my understanding is wrong or I am missing something elsewhere, i.e. maybe in the configuration or the code. Could you put me right about it, please?

You are setting the requirements to be null? You should look at the
condor_q -l output and see what the Requirements expression is for your
job. You can also look at the SchedLog to see if it reports any errors
when matching your jobs.
I don't know what Axis (I assume you are using Apache Axis) will do
with the null. It is entirely possible that your job is getting into
the queue without a Requirements expression, which could be bad
depending on how the Schedd evaluates things. You could try making
your Requirements attribute equal to "TRUE". That'd likely be better
than null.
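As a rough sketch of the check suggested above, the following Python snippet pulls the Requirements expression out of `condor_q -l` output. It assumes the long listing prints one `Name = value` attribute per line; the helper names `extract_attribute` and `job_requirements` are made up for illustration and are not part of Condor or Birdbath.

```python
import subprocess


def extract_attribute(classad_text, name):
    """Return the raw value of attribute `name` from `condor_q -l` style
    text, or None if the attribute is absent.

    Assumes one "Name = value" pair per line; attribute names are
    compared case-insensitively.
    """
    for line in classad_text.splitlines():
        # Split on the first "=" only, so "==" inside the value survives.
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None


def job_requirements(job_id):
    """Run `condor_q -l <job_id>` and return the job's Requirements
    expression as text (hypothetical helper; needs condor_q on PATH)."""
    result = subprocess.run(
        ["condor_q", "-l", job_id],
        capture_output=True, text=True, check=True,
    )
    return extract_attribute(result.stdout, "Requirements")
```

Calling `job_requirements("9.0")` on the submit machine would return None if the job entered the queue without a Requirements expression, which is the case worth ruling out here.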

a) I am using Apache Axis.

b) The Requirements expression for my job is TRUE in both cases, i.e. whether I set it to null or to TRUE. In fact it is (TRUE), i.e. with the parentheses, in both cases.

c) condor_q -analyze tells whether the job is being rejected due to the job's requirements or the Condor machines' own requirements. However, it's neither in my case. It's actually 'unknown reasons'. Please have a look below.

d) The SchedLog seems to be showing a different behaviour, though. Please have a look; I have included it below the condor_q output.

Condor_q OUTPUT (for one job only)
009.000: Run analysis summary. Of 2 machines,
0 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
2 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
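When several jobs are being checked, a summary like the one above can be tallied mechanically. This is only a sketch: it assumes each category line has the form "<count> <description>", as in the output pasted here, and the function names are hypothetical.

```python
import re

# A category line is a count followed by a description, e.g.
# "2 match but reject the job for unknown reasons".
SUMMARY_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_analysis(text):
    """Map each category description in the summary to its machine count."""
    counts = {}
    for line in text.splitlines():
        m = SUMMARY_LINE.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def nonzero_categories(counts):
    """Keep only the categories that at least one machine falls into."""
    return {reason: n for reason, n in counts.items() if n}


# The summary from this post, used as sample input.
SAMPLE = """009.000: Run analysis summary. Of 2 machines,
0 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
2 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
"""

print(nonzero_categories(parse_analysis(SAMPLE)))
```

For the sample, the only non-zero category is the "unknown reasons" one, covering both machines in the pool.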

SchedLog (the last few lines only)
3/6 18:53:34 (pid:1804) Received HTTP POST connection from <>
3/6 18:53:34 (pid:1804) About to serve HTTP request...
3/6 18:53:34 (pid:1804) Completed servicing HTTP request
3/6 18:53:55 (pid:1804) IO: Failed to read packet header
3/6 18:54:28 (pid:1804) IO: Failed to read packet header
3/6 18:55:27 (pid:1804) ProcAPI sanity failure, age = -98757115