Re: [Condor-users] Condor 7.1.0 and condor_config_val




This is indeed a bug, introduced in 7.0.0. Previously, the client sent extra junk at the end of the command, and the server ignored it. In 7.0.0 we caught a number of cases like this, where the server was not correctly checking for and reading the end of the command. Unfortunately, we did not notice that for this specific command the client side was not terminating the command correctly. I'll fix this now, but I may be too late for 7.0.2, which is nearly released.

--Dan
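
For readers following along, the difference between a server that ignores trailing data and one that insists on a clean end of message can be sketched roughly as below. This is only an illustration, not Condor's actual code: `parse_config_command`, `read_cstring`, and the `strict` flag are hypothetical names, and treating the final byte of the payload as the unread data is an assumption made for the example.

```python
# Hypothetical sketch, NOT Condor's implementation: read two NUL-terminated
# strings, then optionally insist that nothing is left unread.

def read_cstring(buf, pos):
    """Return the NUL-terminated string starting at pos and the next offset."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("ascii"), end + 1

def parse_config_command(payload, strict):
    name, pos = read_cstring(payload, 0)
    setting, pos = read_cstring(payload, pos)
    leftover = payload[pos:]
    if strict and leftover:
        # A strict reader treats unread bytes as a malformed message.
        raise ValueError("failed to read end of message: %d unread byte(s)"
                         % len(leftover))
    return name, setting

# String portion of the payload as it appears in the strace output quoted below.
payload = b"max_jobs_running\x00MAX_JOBS_RUNNING=4000\x00\n"

lenient_result = parse_config_command(payload, strict=False)

try:
    parse_config_command(payload, strict=True)
    strict_failed = False
except ValueError:
    strict_failed = True
```

With the lenient setting the two strings are returned and the extra byte is silently dropped; with the strict setting the same input is rejected.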

Daniel Forrest wrote:

I just installed Condor 7.1.0 and I am having a problem with
condor_config_val.  I am trying this command:

condor_config_val -schedd -set MAX_JOBS_RUNNING=4000

This worked fine with Condor 6.8.1.


The SchedLog shows this:

6/4 08:53:07 (fd:16) (pid:205850) DaemonCore received UNAUTHENTICATED command 60002.
6/4 08:53:07 (fd:16) (pid:205850) DaemonCore: Command received via TCP from host <192.168.0.149:44575>, access level ALLOW
6/4 08:53:07 (fd:16) (pid:205850) DaemonCore: received command 60002 (DC_CONFIG_PERSIST), calling handler (handle_config())
6/4 08:53:07 (fd:16) (pid:205850) Calling HandleReq <handle_config()> (0)
6/4 08:53:07 (fd:16) (pid:205850) Failed to read end of message from <192.168.0.149:44575>.
6/4 08:53:07 (fd:16) (pid:205850) handle_config: failed to read end of message
6/4 08:53:07 (fd:16) (pid:205850) Return from HandleReq <handle_config()>
6/4 08:53:07 (fd:16) (pid:205850) CLOSE <192.168.0.149:47089> fd=14


It doesn't look like "-debug" works with condor_config_val, but using
strace to capture some stuff gives me:

sendto(3, "\1\0\0\0000\0\0\0\0\0\0\352bmax_jobs_running\0MAX_JOBS_RUNNING=4000\0\n", 53, 0, NULL, 0) = 53
recvfrom(3, "", 5, 0, NULL, NULL)       = 0
write(4, "condor_read(): Socket closed when trying to read 5 bytes from <192.168.0.149:470"..., 84) = 84
write(4, "IO: EOF reading packet header\n", 30) = 30
write(4, "Stream::get(int) failed to read padding\n", 40) = 40
write(2, "Can\'t receive reply from schedd on condor.lmcg.wisc.edu <192.168.0.149:47089>\n", 78) = 78


The corresponding strace from the schedd looks like:

recvfrom(14, "\1\0\0\0", 4, MSG_PEEK, NULL, NULL) = 4
recvfrom(14, "\1\0\0\0000", 5, 0, NULL, NULL) = 5
recvfrom(14, "\0\0\0\0\0\0\352bmax_jobs_running\0MAX_JOBS_RUNNING=4000\0\n", 48, 0, NULL, NULL) = 48


So it looks like the schedd doesn't send a reply, but why?
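
To make the strace output easier to read, here is a small sketch that decodes the 53 bytes from the client's sendto() call. The layout is inferred from the bytes themselves and is an assumption, not taken from Condor documentation: a 5-byte header (one flag byte plus a 4-byte big-endian length), an 8-byte big-endian integer, and then NUL-terminated strings. The function name `decode` is made up for the example.

```python
import struct

# Bytes exactly as shown in the client's sendto() line, with strace's octal
# escapes rewritten as hex.
packet = (
    b"\x01\x00\x00\x00\x30"                # 5-byte header
    b"\x00\x00\x00\x00\x00\x00\xea\x62"    # 8-byte integer
    b"max_jobs_running\x00"
    b"MAX_JOBS_RUNNING=4000\x00"
    b"\n"
)

def decode(packet):
    # ASSUMPTION: header is 1 flag byte followed by a 4-byte big-endian length.
    flag, length = struct.unpack(">BI", packet[:5])
    payload = packet[5:5 + length]
    # ASSUMPTION: the command number is sent as an 8-byte big-endian integer.
    (command,) = struct.unpack(">q", payload[:8])
    # The rest is NUL-terminated strings plus whatever follows the last NUL.
    *strings, trailing = payload[8:].split(b"\x00")
    return {
        "flag": flag,
        "length": length,
        "command": command,
        "strings": [s.decode("ascii") for s in strings],
        "trailing": trailing,
    }

decoded = decode(packet)
```

Under these assumptions the length field is 48, which matches the size of the schedd's second recvfrom(), and the integer is 60002, the command number the SchedLog reports as DC_CONFIG_PERSIST. After the two strings a single "\n" byte remains, which the sketch leaves uninterpreted.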