Re: [Condor-users] Condor_q hanging with SOAP



Where is the command hanging in this log? I would expect to see a message about the command being deferred while the HTTP requests are in progress, and then being serviced afterwards.

The points where you say the command seems to hang are times when data transfer might be occurring, which would partially block the Schedd.
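
One rough way to narrow down where the hang falls is to look for pauses between consecutive SchedLog timestamps. The following is only a sketch, assuming the line format shown in the quoted log ("month/day hh:mm:ss (pid:N) message"). The function name, the 10-second threshold, and the year default are illustrative choices, not part of Condor. A pause in the log only shows that nothing was logged in that interval; it does not prove the Schedd was blocked.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the quoted SchedLog,
# e.g. "10/4 13:31:33 (pid:4878) Completed servicing HTTP request"
LINE_RE = re.compile(
    r"^(\d{1,2})/(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) \(pid:\d+\) (.*)$"
)

def find_gaps(lines, min_gap_seconds=10, year=2007):
    """Return (gap_seconds, previous_message, next_message) for each pause.

    The log carries no year, so `year` is an assumption taken from the
    date of the thread; logs that cross a year boundary are not handled.
    """
    gaps = []
    prev_time = None
    prev_msg = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # wrapped continuation lines and untimestamped lines
        month, day, hh, mm, ss = (int(g) for g in match.groups()[:5])
        stamp = datetime(year, month, day, hh, mm, ss)
        msg = match.group(6)
        if prev_time is not None:
            delta = (stamp - prev_time).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev_msg, msg))
        prev_time, prev_msg = stamp, msg
    return gaps

# A few lines copied from the log quoted in this thread.
SAMPLE_LOG = """\
10/4 13:31:33 (pid:4878) About to serve HTTP request...
10/4 13:31:33 (pid:4878) Completed servicing HTTP request
10/4 13:32:07 (pid:4878) Activity on stashed negotiator socket
10/4 13:32:08 (pid:4878) Tables are consistent
10/4 13:32:15 (pid:4878) Starting add_shadow_birthdate(36.28)
""".splitlines()

for seconds, before, after in find_gaps(SAMPLE_LOG):
    print(f"{seconds:.0f}s pause between {before!r} and {after!r}")
```

On these sample lines the only pause of 10 seconds or more is the 34 seconds between the completed HTTP request and the next negotiator activity. Comparing such pauses with the times at which condor_q was run would show whether they line up.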


matt

Christopher Parker wrote:
I am running Condor version 6.8.4 Feb 1 2007. The command seems to hang
during submission and also when flocking is about to begin (or just before
the job starts to execute), or so it would seem. The SchedLog contains
repeated occurrences of the following, but this is because the system has no
UID for User1; it is a fictitious user. This user is merely being used for
test purposes; however, the data is retrieved successfully.

Received HTTP POST connection from <137.158.59.235:2100>
10/4 13:31:33 (pid:4878) About to serve HTTP request...
10/4 13:31:33 (pid:4878) Completed servicing HTTP request
10/4 13:32:07 (pid:4878) Activity on stashed negotiator socket
10/4 13:32:07 (pid:4878) Negotiating for owner: user1@xxxxxxxxxxxxxxxxxxxx
10/4 13:32:07 (pid:4878) Checking consistency running and runnable jobs
10/4 13:32:07 (pid:4878) Tables are consistent
10/4 13:32:08 (pid:4878) Out of servers - 18 jobs matched, 18 jobs idle, 1 jobs rejected
10/4 13:32:08 (pid:4878) Increasing flock level for user1@xxxxxxxxxxxxxxxxxxxx to 2.
10/4 13:32:08 (pid:4878) Activity on stashed negotiator socket
10/4 13:32:08 (pid:4878) Negotiating for owner: user1@xxxxxxxxxxxxxxxxxxxx
10/4 13:32:08 (pid:4878) Checking consistency running and runnable jobs
10/4 13:32:08 (pid:4878) Tables are consistent
10/4 13:32:08 (pid:4878) Out of servers - 14 jobs matched, 22 jobs idle, 1 jobs rejected
10/4 13:32:15 (pid:4878) passwd_cache::cache_uid(): getpwnam("user1") failed: user not found
10/4 13:32:15 (pid:4878) (36.28) Failed to find UID and GID for user user1. Cannot chown /opt/condor/home/spool/cluster36.proc28.subproc0 to user. Job may run into permissions problems when it starts.
10/4 13:32:15 (pid:4878) Starting add_shadow_birthdate(36.28)
10/4 13:32:15 (pid:4878) Started shadow for job 36.28 on "<137.158.32.5:49555>", (shadow pid = 18455)
10/4 13:32:18 (pid:4878) passwd_cache::cache_uid(): getpwnam("user1") failed: user not found
10/4 13:32:18 (pid:4878) (36.17) Failed to find UID and GID for user user1. Cannot chown /opt/condor/home/spool/c

Thanks

Christopher


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matthew Farrellee
Sent: Thursday, October 04, 2007 3:55 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor_q hanging with SOAP

Condor is single-threaded, and in older versions the Schedd would essentially be blocked during submission (either via SOAP or condor_submit), which keeps condor_q from returning quickly. However, this problem is supposed to be fixed (at least with respect to SOAP submission) in 6.8 and 6.9.

Can you provide more information?
  - version of Condor?
  - "hang" happens only during the submission?
  - SchedLog mention anything about delaying connection? (might need D_FULLDEBUG)
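
For the last point, a quick scan of the SchedLog for lines that mention a delay can help. The following is a minimal sketch: the keywords "delay" and "defer" are guesses, since the exact wording of the Schedd's message is not given in this thread, and the function name is made up for illustration. D_FULLDEBUG is typically turned on for the Schedd with a configuration line such as SCHEDD_DEBUG = D_FULLDEBUG.

```python
def find_mentions(lines, keywords=("delay", "defer")):
    """Return the lines that contain any keyword, ignoring case.

    The default keywords are assumptions; adjust them to match whatever
    wording the Schedd actually uses in its log.
    """
    lowered = tuple(k.lower() for k in keywords)
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]
```

Typical use would be to open the SchedLog, pass the file object to find_mentions, and compare the timestamps of any hits with the times at which condor_q appeared to hang.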


matt

Christopher Parker wrote:

Hi,

When I submit a job to Condor using the SOAP interface, the condor_q command seems to "hang". The hang lasts about a minute or two, after which condor_q responds normally. I have read the mailing list, but those problems seem to have been caused by NFS mounts. Is there some other reason for this? The Condor home directory is mounted on a separate partition on the same drive, with a symbolic link pointing to it.

Thanks in advance

Christopher

--------------------------------------
Christopher Parker (BSc. Hons)
Department of Computer Science
High Performance Computing Laboratory
University of Cape Town
http://people.cs.uct.ac.za/~cparker/
--------------------------------------


------------------------------------------------------------------------

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at: https://lists.cs.wisc.edu/archive/condor-users/