
[HTCondor-users] SOAP API requests are blocking for a collector daemon



Hi everyone,

I'm trying to work with the HTCondor SOAP API. A request sent to the collector daemon blocks it until the request has completed. For example:

05/06/15 09:24:36 Received HTTP POST connection from <*.*.*.*:42897>
05/06/15 09:24:36 About to serve HTTP request...
05/06/15 09:24:36 Got QUERY_STARTD_ADS
05/06/15 09:24:36 (Sending 547 ads in response to query)
05/06/15 09:25:13 Completed servicing HTTP request
05/06/15 09:25:13 condor_write(): Socket closed when trying to write 319 bytes to <*.*.*.*:14708>, fd is 8
05/06/15 09:25:13 Buf::write(): condor_write() failed
05/06/15 09:25:13 SECMAN: Error sending response classad to <*.*.*.*:14708>!
...
05/06/15 09:25:13 condor_write(): Socket closed when trying to write 319 bytes to <*.*.*.*:20556>, fd is 11
05/06/15 09:25:13 Buf::write(): condor_write() failed
05/06/15 09:25:13 SECMAN: Error sending response classad to <*.*.*.*:20556>!

So the daemon did not handle any other requests for roughly 40 seconds (09:24:36 to 09:25:13), and it also dropped some connections. By contrast, the same query from condor_status is non-blocking:

05/06/15 09:33:32 Got QUERY_STARTD_ADS
05/06/15 09:33:32 Number of Active Workers 0
05/06/15 09:33:32 (Sending 513 ads in response to query)
05/06/15 09:33:32 Query info: matched=513; skipped=0; query_time=0.011775; send_time=0.021067; type=Machine; requirements={true}; peer=<*.*.*.*:13089>; projection={Name Machine Opsys Arch State Activity LoadAvg Memory ActvtyTime MyCurrentTime EnteredCurrentActivity}
05/06/15 09:33:33 Got QUERY_STARTD_ADS
...
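
To put a number on the stall in the first excerpt, here is a minimal Python sketch that measures the time between the "About to serve HTTP request" and "Completed servicing HTTP request" lines. The helper names are my own (not part of HTCondor), and it assumes the log timestamps are in MM/DD/YY HH:MM:SS form:

```python
from datetime import datetime

# Assumed timestamp layout of the collector log lines: MM/DD/YY HH:MM:SS
LOG_TIME_FORMAT = "%m/%d/%y %H:%M:%S"


def parse_log_time(line):
    """Parse the 17-character timestamp at the start of a log line."""
    return datetime.strptime(line[:17], LOG_TIME_FORMAT)


def http_stall_seconds(lines):
    """Seconds between the start and the completion of the first HTTP request.

    Returns None if no complete start/finish pair is found.
    """
    start = None
    for line in lines:
        if "About to serve HTTP request" in line:
            start = parse_log_time(line)
        elif "Completed servicing HTTP request" in line and start is not None:
            return (parse_log_time(line) - start).total_seconds()
    return None


# Lines copied from the SOAP excerpt above
sample = [
    "05/06/15 09:24:36 About to serve HTTP request...",
    "05/06/15 09:24:36 Got QUERY_STARTD_ADS",
    "05/06/15 09:24:36 (Sending 547 ads in response to query)",
    "05/06/15 09:25:13 Completed servicing HTTP request",
]

print(http_stall_seconds(sample))  # 37.0
```

For comparison, the condor_status query above reports query_time=0.011775 and send_time=0.021067 for 513 ads.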

The other unpleasant effect is that if we interrupt the call from the client side, the daemon stays blocked for the duration of the timeout, or even indefinitely if no timeout has been set. This issue makes the API useless for me, because at a minimum I need to ask the collector for the schedulers' ports. Is there any way to avoid this, or are there plans to fix it in future releases?

Thanks in advance.