
Re: [Condor-users] [Birdbath Related] Strange behaviour - 6.7.17



On Mar 5, 2006, at 10:14 AM, Afrasyab Bashir wrote:

[snip]

I added code to catch exceptions on just about every line as per your
instructions. To my utter surprise, the code has suddenly started
working. Transactions are being carried out and files are being sent
with the same code. On one hand I'm happy, but on the other it's very
frustrating because I can't understand what caused the problem. Anyway, I
have some more queries, please.

This is troubling. I was suggesting you change how errors are reported so that you can see why things are actually failing. I never would expect something to all of a sudden start working...


Strange Behaviour (it might not be strange for others though).

[I've used the term 'personal computer' here for the computer I'm
typing on / interacting with.]

1. When I have only 6.7.17 installed on both of the computers in my Condor pool and I start the mini embedded SOAP web server on the remote computer, condor_status -l returns only the machine that is running the central manager / master. It does not matter which computer I run the condor_status -l query on.

condor_status will contact the central manager by default. It is possible that if your central manager started after your personal computer you will have to wait 5 minutes to see your personal computer in the list.


2. When I install 6.6.9 on the personal computer and 6.7.17 on the remote computer, and submit a job from the 6.6.9 computer (using birdbath) to the 6.7.17 computer, I cannot see the job in the queue. To check the queue I use condor_q, but the queue is empty: no jobs, none idle, etc. Therefore, I cannot tell whether my job was submitted or not.

That would be because the jobs are in the 6.7.17 schedd's queue, and condor_q, by default, looks only at your local schedd.


3. When I install 6.7.17 on both computers, install the SOAP server on my personal computer, and then submit the job using birdbath, the job can be seen idle in the queue and remains so forever.

You can see them because they are in the queue that condor_q looks at by default. As for being there forever, you might not have a machine available that they match.


condor_q -analyze says, "2 match but reject the job for unknown reasons". condor_status -java returns an empty string. However, the path is correct on both machines, and on the command prompt the job exits gracefully after execution.

You should look at the Requirements attribute of your job and see if you can figure out why the match might be failing -- maybe both machines in your pool think they have an active user?

It sounds like a configuration issue here, which is good in the sense that you have things working with the SOAP API...


4. When I submit the job as a user whose credentials are not stored on the computer then I can't remove it from the queue on command line even when I'm
trying as the administrator.

You should repost this separately as a general issue.


5. Jobs that were submitted as a user with stored credentials can be marked for removal on the command prompt; however, condor_q keeps displaying them as jobs marked for removal, and condor_q -analyze says "Request is removed" but does show the id and this remark.

Jobs submitted with the SOAP API have their LeaveJobInQueue attribute set (or some similar name). While it evaluates to TRUE the job will sit in the X (removed) state in the queue. You can use condor_qedit to change the attribute, or the CloseSpool() SOAP call. condor_rm -forcex might work too.


Birdbath Specific Queries

6. condor_q shows all the jobs and related details within a job queue. What's the substitute for it in birdbath? I ask because it seems that I have to mention the transaction, clusterId, jobId, etc. to retrieve the information, whereas condor_q has no such restriction. What if I want to manage the jobs with birdbath?

In addition to GetJobAd(), there is a GetJobAds() call that takes a classad expression and returns matching jobs, e.g. the expression "TRUE" would give you everything. You can pass null (in Java) for the transaction.
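
To make the constraint idea a bit more concrete, here is a rough Java sketch of building the expression strings you would hand to GetJobAds(). The class and method names are made up for illustration and are not part of birdbath; ClusterId and ProcId are the standard job attributes. How the string is actually passed depends on the stubs you generated from the WSDL, so that part is left out.

```java
public class JobConstraints {
    // Matches every job in the queue, as described above.
    static final String ALL_JOBS = "TRUE";

    // Hypothetical helper: constraint matching all jobs in one cluster.
    static String forCluster(int clusterId) {
        return "ClusterId == " + clusterId;
    }

    // Hypothetical helper: constraint matching a single job (cluster.proc).
    static String forJob(int clusterId, int procId) {
        return forCluster(clusterId) + " && ProcId == " + procId;
    }

    public static void main(String[] args) {
        // Each of these strings would be the classad expression argument
        // to GetJobAds(); the transaction argument can be null in Java.
        System.out.println(ALL_JOBS);
        System.out.println(forCluster(12));
        System.out.println(forJob(12, 0));
    }
}
```

Using "TRUE" gives roughly what a plain condor_q shows; the narrower expressions are closer to asking condor_q about a specific cluster or job.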


7. Can I mention a requirement as "Machine = \"marie-LAPTOP\"" for the job to run on that particular computer?

You sure can, but you might need an "==".
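
For what it's worth, a minimal sketch of building that expression in Java, with the inner quotes escaped and "==" used for the comparison. The class and method names are hypothetical, just for illustration, and the name has to match whatever the machine advertises as Machine in condor_status -l.

```java
public class MachineRequirement {
    // Hypothetical helper: builds a Requirements expression that pins a job
    // to one machine by comparing against the Machine attribute.
    static String forMachine(String hostname) {
        // Reject characters that would break the quoted string rather than
        // guess at escaping rules.
        if (hostname.indexOf('"') >= 0 || hostname.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("unexpected character in hostname: " + hostname);
        }
        // "==" is the equality comparison; the hostname goes in double quotes.
        return "Machine == \"" + hostname + "\"";
    }

    public static void main(String[] args) {
        // Prints: Machine == "marie-LAPTOP"
        System.out.println(forMachine("marie-LAPTOP"));
    }
}
```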



matt