[Condor-users] Capabilities of schedd HA

I've been playing with schedd HA. I haven't quite gotten the configuration right, but before I put any more time into it, I want to make sure that it can do what I'm hoping it can.

Those who have read or replied to my earlier posts will recall that my Condor setup must have no single point of failure. I'm currently working on the schedd. The schedd now runs on the CMs and only needs to accept submissions from the active CM. I've tested CM failover while a job was executing. Although the negotiator did fail over, the job could not complete until failback. Is there any way around this (e.g. a shared file system between the CMs)? Or am I misunderstanding these mechanisms? I would really like to implement CM and schedd failover that is transparent to job completion.