Re: [Condor-users] Submitting MPI JOb

On Mon, Jun 23, 2008 at 2:49 AM,  <txcom2003@xxxxxxxxxxxxxxxxxxx> wrote:
> No, I submitted the second job when Schedd has release those claims and
> the status of all machine was UNCLAIMED. So the condition is same as the
> first job.
So it is likely that your first submission occurred just after a
negotiation cycle, and had to wait out most of the interval before
the next one. The second submission may have occurred right before the
next negotiation cycle, and got matched immediately.

To help address this, you can adjust two settings:
NEGOTIATOR_INTERVAL - This controls how often negotiation occurs, and
can be shortened depending on the specifics of your installation. If
you set this far too short (say 10 seconds), and you have machines on
different networks that take a while to start jobs, you might get some
thrashing. If your machines are close together network-wise, you can
safely set this lower. I've made this very small where the
environment's network speed allowed it. Setting this to
30 can work in some environments provided you
NEGOTIATOR_CYCLE_DELAY - This is a required delay between any two
negotiation cycles, and can be set smaller than the default (20
seconds I think). I've made this 3-5 seconds in situations where all
the machines are on the same switch. Even on larger installations,
putting this at 10-15 seconds is generally acceptable.

Both of these changes should be made in the Central Manager's
configuration. Additionally, Condor 7.0 contains a number of
matchmaking/parallel job improvements, so if you're using 6.8, you
might consider upgrading. As with all negotiation cycle tweaking,
you'll want to watch the NegotiatorLog to ensure that machines that
were matched don't get unmatched during the next cycle because these
settings were too low.
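
As a rough sketch, the two settings might look like this in the
Central Manager's local configuration file. The values are only
illustrative, picked from the ranges mentioned above for machines that
sit on the same switch, and the file name condor_config.local is an
assumption about how your installation is laid out:

```
# condor_config.local on the Central Manager (illustrative values only)

# How often, in seconds, the negotiator starts a negotiation cycle.
NEGOTIATOR_INTERVAL = 30

# Minimum delay, in seconds, between any two negotiation cycles.
NEGOTIATOR_CYCLE_DELAY = 5
```

After editing the file, the daemons need to re-read their
configuration, for example by running condor_reconfig on the Central
Manager. If matches start getting dropped, raise the values again.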

Hope this helps,

