If I understand the Condor manual correctly,
high availability (HA) for submit machines requires that there be only one submission
point. If I have 4 submit machines, any of which can be submitting jobs
via different users, is it possible to set up HA such that if any one (or
more than one) of these SCHEDDs goes down, one of the other SCHEDDs can pick
up its jobs?
The config macro settings do not seem
to lend themselves to this, so I am wondering whether anyone
can clarify whether HA for SCHEDDs can support multiple submission points.
I believe having only one submit machine would be a limitation for us,
because we often submit a thousand or so jobs, and heap or memory
on a single machine could be a limiting factor.