
[HTCondor-users] condor_status update time


Hi Everyone,


My apologies if this has been asked already, or if I missed a notification. I have searched and not found any references to this question.


There appears to be a new delay before condor_status reflects the loss of cores in the pool when HTCondor is stopped on a machine. I am pretty sure that delay was not there before.


Example: If I have 80 cores (16 cores on each of 5 VMs that are only running startds) and they are all up, then condor_status correctly shows 80 cores. If I then shut down HTCondor on one of the VMs, a ps shows that the condor processes are gone, but condor_status does not update to reflect that the number of cores is down to 64 for many minutes (as many as 10 or 15 minutes).


I believe that this is new behavior in 8.6 (we are currently running 8.6.6). I double-checked in our 8.4 pool before we updated it, and I am pretty sure that it did not have that behavior, meaning that a shutdown of HTCondor on a VM in the pool was immediately reflected in condor_status.


Is this behavior expected? Is there a better way (other than the ps) to determine which cores are really there, with a reliable, immediate answer? We have been troubleshooting some issues that have required a number of shutdowns and startups, and it has become an issue (really just a pain in the ... there are other ways to tell) that the condor_status result is not a true, current reflection of the status of the pool. Did I miss a new knob or a new command?  :)


            Thank You -- Mary


Mary Romelfanger

Deputy Branch Manager

Data Systems Branch


{o,o}      Phone 410-338-6708
/)__)      Cell      443-244-0191
-"-"-          mary@xxxxxxxxx


Space Telescope Science Institute

3700 San Martin Drive

Baltimore, MD 21218