
[HTCondor-users] Fast shutdown happens frequently on one node



Dear condor Experts:

  From time to time the condor service on one machine shuts down on its own, and the MasterLog shows:

08/25/15 15:39:54 The DaemonShutdownFast expression "1000000" evaluated to TRUE: starting fast shutdown
08/25/15 15:39:54 Got SIGQUIT.  Performing fast shutdown.
08/25/15 15:39:54 Sent SIGQUIT to STARTD (pid 3333)
08/25/15 15:40:00 AllReaper unexpectedly called on pid 3333, status 0.
08/25/15 15:40:00 The STARTD (pid 3333) exited with status 0
08/25/15 15:40:00 All daemons are gone.  Exiting.
08/25/15 15:40:00 **** condor_master (condor_MASTER) pid 3306 EXITING WITH STATUS 99


However, I don't see DAEMON_SHUTDOWN or DAEMON_SHUTDOWN_FAST set anywhere in the configuration files:


node029:~# condor_config_val -dump | grep SHUTDOWN
EVENTD_SHUTDOWN_CLEANUP_INTERVAL = 3600
EVENTD_SHUTDOWN_CONSTRAINT =
EVENTD_SHUTDOWN_SLOW_START_INTERVAL = 0
EVENTD_SHUTDOWN_TIME =
EVENTD_SIMULATE_SHUTDOWNS =
NEGOTIATOR_TRIM_SHUTDOWN_THRESHOLD = 0
SHUTDOWN_FAST_TIMEOUT = 300
SHUTDOWN_GRACEFUL_TIMEOUT =
STARTD_FACTORY_SCRIPT_SHUTDOWN_PARTITION =
STARTD_NOCLAIM_SHUTDOWN = 0
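
  To check how often this happens and which expression triggers it each time, the relevant MasterLog lines can be pulled out with a small shell helper. This is only a sketch: the function name fast_shutdowns is made up for illustration, and it assumes the log lines have exactly the format shown above (date, time, then the expression in double quotes).

```shell
# Hypothetical helper: list the fast-shutdown triggers recorded in a MasterLog.
# Prints one line per event: date, time, and the quoted expression.
fast_shutdowns() {
    # $1: path to a MasterLog file
    # Split each line on double quotes, so $2 is the expression text and
    # $1 is everything before it (starting with the date and time).
    awk -F'"' '/evaluated to TRUE: starting fast shutdown/ {
        split($1, head, " ")
        print head[1], head[2], $2
    }' "$1"
}
```

  It could be run as fast_shutdowns "$(condor_config_val MASTER_LOG)", assuming MASTER_LOG is defined on the node and points at the log quoted above.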

  Any idea what led to this fast shutdown?

  Cheers,
  Gang