
[Condor-users] High availability condor_schedd, shared file systems





Has anyone set up a high-availability condor_schedd as described in
section 3.10.1 of the Condor manual?
If so, what solution are you using for the shared file system? I would
like to find a solution that doesn't involve NFS, if possible.
Has anyone tried AFS for this purpose?  Or GFS?

And what is the effect when a backup schedd on a different IP takes
over the job queue?  condor_submit will not automatically fail over
to the new IP, will it?  What about all the ClaimIds, do those work OK?

Has anyone else tried to use Linux-HA to move the schedd IP to the
backup machine when the master schedd fails, so that the backup schedd
starts not only with the same job_queue.log but also with the same IP
as before?

Thanks

Steve Timm



--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.