
[HTCondor-users] dedicated_scheduler failing to deactivate claims.




Hi,

we are seeing an issue with the Condor dedicated scheduler for the
parallel universe leaving active claims behind. The startd on the
execute node says:

02/10/15 12:18:45 condor_write(): Socket closed when trying to write 28 bytes to <192.168.10.100:36614>, fd is 3
02/10/15 12:18:45 Buf::write(): condor_write() failed
02/10/15 12:18:45 Failed to send response ClassAd in deactivate_claim.

when trying to send the response to a DEACTIVATE_CLAIM command issued
by the dedicated scheduler. By that time the dedicated scheduler has
already closed the command socket, because
DedicatedScheduler::deactivateClaim() does not decode any response.
The same is true of DedicatedScheduler::releaseClaim(), in both the
most recent stable and development versions.
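
To illustrate the failure mode I suspect, here is a minimal Python
sketch. It is not HTCondor code: run_exchange(), the payloads and the
buffer size are all made up for the example, and it assumes a
Unix-domain socket pair on Linux or macOS (over TCP the first write
after the peer closes may still appear to succeed). It only shows that
a receiver cannot deliver its reply once the sender has closed the
socket without reading it.

```python
import socket


def run_exchange(wait_for_reply: bool) -> bool:
    """Return True if the receiver managed to write its reply."""
    # Hypothetical stand-ins: "sender" plays the scheduler side,
    # "receiver" plays the startd side.
    sender, receiver = socket.socketpair()
    try:
        # Placeholder payload, not the real wire format.
        sender.sendall(b"DEACTIVATE_CLAIM")
        if not wait_for_reply:
            # Send the command and close without reading any response.
            sender.close()

        # The receiver still sees the queued command...
        receiver.recv(1024)
        try:
            # ...but its reply fails if the peer is already gone.
            receiver.sendall(b"OK")
            delivered = True
        except OSError:
            delivered = False

        if wait_for_reply:
            # Reading the reply before closing keeps the exchange clean.
            sender.recv(1024)
        return delivered
    finally:
        sender.close()
        receiver.close()
```

In this sketch run_exchange(False) reproduces the write failure on the
receiving side, while run_exchange(True) lets the reply through.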

Is "ship and pray" the intended behaviour for these commands, or is
this a bug? If it is intended, should we always let the claims live to
the end of their worklife?

Thank you for a friendly ear.
Francesco Prelz
INFN - Milan