
Re: [Condor-users] Condor configuration question



WANT_VACATE and PREEMPT are evaluated by the startd on the execute
machine.  PREEMPTION_REQUIREMENTS is evaluated by the negotiator.
You should be looking at the StartLog to see what is actually happening.
Also, by default a negotiation cycle only runs every 20 seconds, so
PREEMPTION_REQUIREMENTS as you have it written is probably not
going to have time to kick in.

Steve Timm
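
As a rough sketch of the split described above, the settings from the
question could be grouped by the daemon that reads them. The expressions
are copied unchanged from the question, the NEGOTIATOR_INTERVAL value is
just the 20-second figure quoted above, and the two-section layout is an
assumption made for illustration, not a tested configuration.

```
# --- central manager: read by the condor_negotiator ---
# Seconds between negotiation cycles (figure quoted above, shown only
# to make the timing visible; not a recommendation).
NEGOTIATOR_INTERVAL = 20
# Checked once per negotiation cycle, not continuously.
PREEMPTION_REQUIREMENTS = (CurrentTime - JobStart) > 10

# --- execute machine: read by the condor_startd ---
# Evaluated locally by the startd; its decisions show up in the StartLog.
PREEMPT = (CurrentTime - JobStart) > 10
WANT_VACATE = (CurrentTime - JobStart) > 10
MaxJobRetirementTime = 10
```

Setting PREEMPTION_REQUIREMENTS only on the execute machine, or PREEMPT
only on the central manager, would have no effect under this split,
since each daemon reads only its own expressions.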



On Fri, 30 Jan 2009, Balamurali Ananthan wrote:

Thanks for the reply Steve. Here is another question:

I want to preempt a job. Here is the configuration I have on both
the master/submit machine and the execute machine:

[bala@node2 condor-7.1.4]$ condor_config_val WANT_VACATE
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val PREEMPTION_REQUIREMENTS
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val PREEMPT
(CurrentTime - JobStart) > 10

[bala@node2 condor-7.1.4]$ condor_config_val MaxJobRetirementTime
10

I did a condor_reconfig on all the machines. With this configuration in
place, I was expecting every job to be preempted 10 seconds after it
starts, then get 10 seconds to clean up before being killed. Instead, the
jobs I submit, which usually run for 3 minutes, take around 30 minutes
(much of that time I see the job in the Idle state) and then complete,
which is not what I expected.

Any idea what's wrong with the configuration? I would like Condor to
kill my jobs in 20 seconds.

Thanks.
.Bala.

Steven Timm wrote:
2 ways to do it
a) here is a preemption requirements statement much like
the UWCS default one.

[root@fcdf2x1 ~]# condor_config_val PREEMPTION_REQUIREMENTS
(((CurrentTime - EnteredCurrentState) > (1 * (10 * 60)) && RemoteUserPrio >
SubmittorPrio * 1.2) && RemoteUser =!= "cdf@xxxxxxxx" && RemoteUser =!=
"cdffgrid@xxxxxxxx" && RemoteUser =!= "cdfnam@xxxxxxxx" && RemoteUser =!=
"cdfdev@xxxxxxxx"

All you have to do is raise the time threshold from 600 seconds,
as above, to however many seconds you want.

b) The second thing you can do is to set a nonzero MaxJobRetirementTime,
so jobs will still be preempted but each one will have up to
MaxJobRetirementTime seconds to finish.

For both of the scenarios above, machine RANK should be set to zero.

Steve Timm
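
A minimal sketch of how the two options might look for the 11-hour case
in the original question. The 11-hour figure comes from that question;
the protected-user clauses of the site expression above are left out,
the priority comparison follows the stock example policy, and which
file each setting goes in is an assumption. Treat this as a starting
point to test, not a verified policy.

```
# Option (a), on the central manager (negotiator):
# allow priority-based preemption only after the machine has been in
# its current state for more than 11 hours.
PREEMPTION_REQUIREMENTS = (CurrentTime - EnteredCurrentState) > (11 * 60 * 60) && RemoteUserPrio > SubmittorPrio * 1.2

# Option (b), on the execute machines (startd):
# let a job that is about to be preempted keep running until it has
# had up to 11 hours.
MAXJOBRETIREMENTTIME = 11 * 60 * 60

# For either option, on the execute machines:
# machine RANK set to zero, as advised above.
RANK = 0
```

After changing the files, condor_reconfig on the affected machines
makes the daemons reread them.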



On Tue, 6 Jan 2009, Balamurali Ananthan wrote:

Greetings!

Wondering if it is possible to configure Condor so that a remote user's
job is not preempted before a certain amount of time has elapsed.

For example, userx submits a job that runs for more or less 10 hours. I want
to configure Condor so that the job, once started on an execute machine,
is not disturbed for 11 hours.

If this is possible, could someone please point me to the right
documentation.

Thanks much!
.Bala.

--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.