[HTCondor-users] startd-cron not working from time to time (?)
- Date: Tue, 17 May 2016 11:26:44 +0200 (CEST)
- From: "Beyer, Christoph" <christoph.beyer@xxxxxxx>
- Subject: [HTCondor-users] startd-cron not working from time to time (?)
I am using the startd cron feature on 8.4.4. The configured period of 180 seconds works fine until the node is completely busy, with jobs running on all cores. At that point I observe that the startd cron job no longer runs. In some cases tracing the startd process is enough to get the cron job to run again; in others a condor_reconfig does the job.
Is this a known issue, or is there a way around this behaviour?
Here are the relevant STARTD_CRON config lines:
STARTD_CRON_JOBLIST = NODEHEALTH
STARTD_CRON_NODEHEALTH_EXECUTABLE = /etc/condor/tests/healthcheck_wn_condor.sh
STARTD_CRON_NODEHEALTH_MODE = Periodic
STARTD_CRON_NODEHEALTH_PERIOD = 180s
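To make it easier to see when the cron job last ran, the script could publish a timestamp into the machine ad. What follows is only a minimal sketch, not the actual healthcheck_wn_condor.sh: the attribute names NODEHEALTH_LAST_RUN and NODEHEALTH_OK and the /tmp writability check are assumptions chosen for illustration. It relies on the startd reading "Attribute = value" lines from the script's standard output.

```shell
#!/bin/sh
# Hypothetical stand-in for a startd cron health check.
# Prints ClassAd-style "Attribute = value" lines on stdout.

emit_health_ad() {
    # Epoch seconds of this run; a stale value would indicate
    # that the cron job has stopped being invoked.
    now=$(date +%s)
    echo "NODEHEALTH_LAST_RUN = ${now}"

    # Placeholder check (assumption): is /tmp writable?
    if [ -w /tmp ]; then
        echo "NODEHEALTH_OK = True"
    else
        echo "NODEHEALTH_OK = False"
    fi
}

emit_health_ad
```

With something like this in place, comparing NODEHEALTH_LAST_RUN against the current time (for example via condor_status -af Name NODEHEALTH_LAST_RUN) should show whether the value stops advancing every 180 seconds once the node is fully loaded.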
/* Christoph Beyer | Office: Building 2b / 23 *\
* DESY | Phone: 040-8998-2317 *
* - IT - | Fax: 040-8994-2317 *
\* 22603 Hamburg | http://www.desy.de */