
[Condor-users] Condor fault tolerance



Hello,

What kind of fault tolerance does Condor offer? If a job is running on a node and a power failure occurs, does Condor detect that the node is down and restart the job on another node? I tried this on a cluster, and even though Condor detected that the node had restarted, it still listed the job as running when it was not. Is this common behavior? Thanks,

JVFF

