
Re: [Condor-users] Condor_q hanging with SOAP



I am running Condor version 6.8.4 (Feb 1 2007). condor_q seems to hang
during submission and again when flocking is about to begin (or just before
the job starts to execute), or so it would seem. The SchedLog contains
repeated occurrences of the following. The UID warnings appear because the
system has no UID for user1; it is a fictitious user used only for test
purposes. The data is nevertheless retrieved successfully.

Received HTTP POST connection from <137.158.59.235:2100>
10/4 13:31:33 (pid:4878) About to serve HTTP request...
10/4 13:31:33 (pid:4878) Completed servicing HTTP request
10/4 13:32:07 (pid:4878) Activity on stashed negotiator socket
10/4 13:32:07 (pid:4878) Negotiating for owner: user1@xxxxxxxxxxxxxxxxxxxx
10/4 13:32:07 (pid:4878) Checking consistency running and runnable jobs
10/4 13:32:07 (pid:4878) Tables are consistent
10/4 13:32:08 (pid:4878) Out of servers - 18 jobs matched, 18 jobs idle, 1 jobs rejected
10/4 13:32:08 (pid:4878) Increasing flock level for user1@xxxxxxxxxxxxxxxxxxxx to 2.
10/4 13:32:08 (pid:4878) Activity on stashed negotiator socket
10/4 13:32:08 (pid:4878) Negotiating for owner: user1@xxxxxxxxxxxxxxxxxxxx
10/4 13:32:08 (pid:4878) Checking consistency running and runnable jobs
10/4 13:32:08 (pid:4878) Tables are consistent
10/4 13:32:08 (pid:4878) Out of servers - 14 jobs matched, 22 jobs idle, 1 jobs rejected
10/4 13:32:15 (pid:4878) passwd_cache::cache_uid(): getpwnam("user1") failed: user not found
10/4 13:32:15 (pid:4878) (36.28) Failed to find UID and GID for user user1. Cannot chown /opt/condor/home/spool/cluster36.proc28.subproc0 to user. Job may run into permissions problems when it starts.
10/4 13:32:15 (pid:4878) Starting add_shadow_birthdate(36.28)
10/4 13:32:15 (pid:4878) Started shadow for job 36.28 on "<137.158.32.5:49555>", (shadow pid = 18455)
10/4 13:32:18 (pid:4878) passwd_cache::cache_uid(): getpwnam("user1") failed: user not found
10/4 13:32:18 (pid:4878) (36.17) Failed to find UID and GID for user user1. Cannot chown /opt/condor/home/spool/c
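
To separate the harmless UID warnings from anything related to the hang, the
excerpt above could be filtered with something like the following. This is
only a rough sketch: the function name uid_failures and the regular
expression are my own, and it assumes each entry sits on a single line in the
actual SchedLog file (the mail client wrapped them here).

```python
import re

# Pattern written against the excerpt above; the real SchedLog format may
# differ between Condor versions, so treat this as an assumption.
UID_FAILURE = re.compile(
    r"^(?P<date>\d+/\d+) (?P<time>\d+:\d+:\d+) \(pid:(?P<pid>\d+)\) "
    r"\((?P<job>\d+\.\d+)\) Failed to find UID and GID for user (?P<user>\w+)\."
)


def uid_failures(lines):
    """Return (time, job_id, user) for every UID lookup failure in lines."""
    hits = []
    for line in lines:
        match = UID_FAILURE.match(line.strip())
        if match:
            hits.append((match.group("time"), match.group("job"), match.group("user")))
    return hits


# Lines copied from the excerpt; the last one is truncated exactly as posted.
sample = [
    "10/4 13:32:08 (pid:4878) Tables are consistent",
    '10/4 13:32:15 (pid:4878) passwd_cache::cache_uid(): getpwnam("user1") failed: user not found',
    "10/4 13:32:15 (pid:4878) (36.28) Failed to find UID and GID for user user1. "
    "Cannot chown /opt/condor/home/spool/cluster36.proc28.subproc0 to user.",
    "10/4 13:32:18 (pid:4878) (36.17) Failed to find UID and GID for user user1. "
    "Cannot chown /opt/condor/home/spool/c",
]

failures = uid_failures(sample)
for time, job, user in failures:
    print(f"{time} job {job}: no UID for {user}")
```

Whatever lines remain after dropping these should be the ones worth looking
at for the delay itself.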

Thanks

Christopher


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matthew Farrellee
Sent: Thursday, October 04, 2007 3:55 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor_q hanging with SOAP

Condor is single-threaded, and in older versions the Schedd would
essentially be blocked during submission (either via SOAP or
condor_submit), which keeps condor_q from returning quickly. However,
this problem is supposed to be fixed (at least with respect to SOAP
submission) in 6.8 and 6.9.
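
To illustrate the blocking effect, here is a toy simulation of a
single-threaded server handling requests strictly in order. It is not Condor
code; the function name and the timings (a 90 second submission, a query
arriving 5 seconds in) are made up purely for illustration.

```python
def serve_single_threaded(requests):
    """Simulate a single-threaded daemon that serves requests one at a time.

    requests: list of (name, arrival_time, service_time), sorted by arrival.
    Returns {name: (start_time, finish_time)}.
    """
    clock = 0.0
    schedule = {}
    for name, arrival, service in requests:
        # A request cannot start before it arrives, nor before the
        # previous request has finished.
        start = max(clock, arrival)
        finish = start + service
        schedule[name] = (start, finish)
        clock = finish
    return schedule


# Hypothetical workload: one long submission, then a quick queue query.
requests = [
    ("soap_submit", 0.0, 90.0),
    ("condor_q", 5.0, 0.5),
]

schedule = serve_single_threaded(requests)
wait = schedule["condor_q"][0] - 5.0
print(f"condor_q waited {wait:.1f}s before being served")
```

The query itself is cheap, but it sits behind the submission for the whole
time the submission takes, which looks like a hang from the client side.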

Can you provide more information?
  - Which version of Condor?
  - Does the "hang" happen only during submission?
  - Does the SchedLog mention anything about delaying connection? (might
need D_FULLDEBUG)


matt

Christopher Parker wrote:
> Hi,
>
> When I submit a job to Condor using the SOAP interface, the condor_q
> command seems to "hang" for a minute or two before responding normally.
> I have read the mailing list, but the similar problems reported there
> had to do with NFS mounts. Is there some other reason for this?
> The Condor home directory is on a separate partition of the same
> drive, with a symbolic link pointing to it.
>
> Thanks in advance
>
> Christopher
>
> --------------------------------------
> Christopher Parker (BSc. Hons)
> Department of Computer Science
> High Performance Computing Laboratory
> University of Cape Town
> http://people.cs.uct.ac.za/~cparker/
> --------------------------------------
>
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> 
> The archives can be found at: 
> https://lists.cs.wisc.edu/archive/condor-users/