[Condor-users] Capabilities of schedd HA
I've been playing with schedd HA. I haven't quite gotten the
configuration right, but before I put any more time into it, I want to
make sure that it can do what I'm hoping it can.
Those who have read/replied to my earlier posts will recall that my
Condor setup must have no single point of failure. I'm currently working
on the schedd. The schedd now runs on the CMs and only needs to accept
submissions on the active CM. I tested CM failover while a job was
executing: the negotiator did fail over, but the job could not complete
until failback. Is there any way around this (e.g. a shared file system between
CMs)? Or am I misunderstanding these mechanisms? I would really like to
be able to implement CM and schedd failover that is transparent to job