
Re: [Condor-users] Vacating job and attaching meta data for the next host to take over the vacated job



On Tue, 22 Feb 2005 14:55:29 -0500, Dave Lajoie <davelaj@xxxxxxxxxxxx> wrote:
> Tx Matt, pls see below
> 
> A few more clarifications...
> I don't use checkpointing, and the universe used is Vanilla, as you might
> have already guessed ;)
> 
> From what I understand I must use preemption: if there is a render job
> with higher priority, I really want to inform the machines that a higher
> priority job has been matched to them (preempting, right?), and somehow these
> machines need to vacate the currently running job, but not at any expense. I
> want flexible preemption rules that give the currently rendering frame a
> chance to finish before accepting the vacate (due to preemption)
> notification.

Indeed - if you wish to have higher priority jobs kick out running ones,
you have no option but to use preemption.

Your possibilities for allowing a nearly completed job to finish are
(rough config sketches of both routes follow below):

Provide a set amount of retirement time to the job, so that ones near
the end of their task will complete. This has the side effect of
letting jobs which aren't going to make it anyway run a bit longer and
(potentially) waste cycles, and it increases the latency before the
higher priority job starts. (It's a fine line as to whether the
benefit outweighs the cost.)

Let the KILL expression spot whether the job is close to finishing and
give it a bit of extra time (this requires talking back to the
submitting job queue - see below).
This has few downsides except:
a) the job will terminate because it has completed, but condor will
take that as a vacate checkpoint, transferring all your files and
eventually sending the job to another machine, where it would have to
spot that it was a restart of an already finished job and kill itself
immediately - not very nice.
b) You would absolutely have to alter the job's classad.
  - note for b: I do not know if the startd caches the job's classad
from when it started, so this may not work anyway...
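
As a very rough illustration of both routes (untested - RenderProgress
is a made-up job attribute your wrapper would have to keep up to date
via condor_qedit, everything else is standard startd config):

    # Route 1: give a preempted job a fixed retirement window in which
    # to finish on its own before the claim is actually vacated
    MaxJobRetirementTime = 20 * 60

    # Route 2: after a vacate starts, only hard-kill once the job has had
    # 10 minutes of grace AND it is not claiming to be nearly done
    # (TARGET is the running job's classad - subject to the caching
    # caveat in the note for b above)
    KILL = (TARGET.RenderProgress =?= UNDEFINED || TARGET.RenderProgress < 80) && \
           ((CurrentTime - EnteredCurrentActivity) > (10 * 60))

Treat those as sketches of the idea rather than drop-in policy.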
 
> That is what I thought I would do at the beginning. I am seeking ways to
> promote the information without polluting the file server with temporary files
> which eventually need to be cleaned up afterwards.
> I guess I will have to pollute ;)

The only form of communication condor itself gives you between runs on
separate machines is written files - you could run a database and have
the jobs update their state in there, but this may cause you other
(security/bottleneck) issues.
 
> > you could hack around this and use condor_qedit on a user-set classad
> > attribute on the job, but be careful.
> 
> condor will be installed on ALL machines, so my perl wrapper can have
> access to condor_qedit, but on reflection it might be a bad idea, since it
> will not be real time. The collector gets the machine specific attributes
> every 5 mins or so (I think, right?). I may want to implement a timer/alarm
> in my perl wrapper such that the "latest" render progression is published in a
> text file in the node's log directory (on the network), which can then be
> parsed by another script (web page / cgi bin / php). I think I will keep it
> simple so it is reliable and predictable ;)

While the collector will get the machine specifics, the evaluation of
*job* classads normally involves communication with the submitting schedd
- again, I do not know if there is a local cache such that the talk-back
method I describe wouldn't work.

Note however that a 5 minute latency is not that big a deal - if you
believe it is, you may be disappointed with condor, since there is
considerable overhead to using condor for very small, very fast jobs,
especially with preemption going on. Your jobs should really be tuned to
take a _minimum_ of 15 mins or so...
 
> I fear I can't walk away from preemption in this setup since the renderfarm
> must remain responsive whenever higher priority jobs are made available.

In that case you have some problems - the second route seems like the
only one available to you. I would suggest that your idea for an
external watcher is reasonable, but it will require access to all
the queues to edit each job's state to reflect its progress (or lack of
it), and again I don't know if this will work.

Also, you haven't said how you mean to distinguish higher priority
jobs. This is something condor isn't very good at (job rather than
user ranking).
The only easy way to achieve it is to use a machine-specific Rank
setting. This however ALWAYS leads to preemption (you cannot prevent
it), thus you would have to use the preemption rank to pick the least
bad candidate and accept that, when there isn't a nice one, you just
get the best of a bad choice...
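
To make the machine-rank route concrete, a sketch (JobPrio is the
standard job attribute set by the "priority" command in the submit
file; untested as always):

    # startd config: prefer jobs with a higher user-set priority; any
    # newly matched job with a larger JobPrio will preempt the current one
    RANK = TARGET.JobPrio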
 
> > Note that the default preemption rank is to preempt the longest
> > running jobs - while fine as a default for the standard universe, this
> > is sorely lacking for a vanilla-only farm - I recommend inverting this
> > logic to start with
> 
> Good to know

Indeed
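
For reference, the inversion is just a sign flip on the run-time term in
the negotiator config (a sketch - check the exact expression your
default config ships with):

    # Machines with the highest PREEMPTION_RANK are preempted first, so
    # ranking by -TotalJobRunTime sacrifices the jobs that have run the
    # *shortest* time and protects the ones with the most work invested
    PREEMPTION_RANK = -TotalJobRunTime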
 
> if Preemption_rank is based on how long a job has been active, I think
> I may run into a problem where a hung application could be running for
> several hours, thus lowering its probability of being preempted (as you
> mentioned).

Yes - hence your external watcher should probably try to spot such
jobs and remove them from the queue.
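(For example, if the wrapper does publish a progress attribute, the
watcher could feed condor_rm -constraint an expression roughly like

    JobStatus == 2 && RenderProgress < 5 && (CurrentTime - JobStartDate) > 7200

i.e. remove running jobs that still report essentially no progress after
two hours - the attribute name and thresholds are invented for the sake
of the example.)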
 
> That's the reason why I was interested in "injecting" a new custom job
> attribute which would get considered by the preemption_rank.
> 
> Example: if JOB_RENDER_PROGRESSION > 80 then tune the ranking to lower
> the probability of being preempted

That would be a reasonable strategy, so long as the startd's evaluation
of the current job's classad is not always against a cached copy from
when it was started (you need an answer from someone on the condor team
for that).
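
If it turns out not to be cached, the usual way to let pool-level
expressions see such an attribute is to have the startd copy it into the
machine ad, e.g. (still using a made-up RenderProgress name, and
entirely untested):

    # startd: advertise the job's progress attribute in the machine ad
    STARTD_JOB_EXPRS = $(STARTD_JOB_EXPRS), RenderProgress

    # negotiator: prefer to preempt the jobs that have run the least
    # time, and strongly protect any machine whose job claims to be
    # more than 80% done
    PREEMPTION_RANK = -TotalJobRunTime - 1000000 * (RenderProgress =!= UNDEFINED && RenderProgress > 80)

Whether the advertised value tracks later condor_qedit updates is the
same open question as above.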

Matt