
Re: [Condor-users] Condor Call back mechanism



On 5/31/07, Hemant Tanwar <hemant_tanwar@xxxxxxxxxxx> wrote:

 Hi,


We do not want to tail the log file manually, because the jobs are
submitted by the system on demand and no human being is involved. So
we need to build a system (via software programs) which will
automatically take certain actions after a job is finished by Condor.

Manipulating the log file is not that sophisticated.



Registering a Perl module has a shortcoming: if we submit 4000 jobs, there
will be 4000 Perl modules running and listening for the status of a job,
which may kill the system.



So we want to handle this inside Condor: when it finishes a job, we want
to put a message on a message queue or put a record in a database.



The Condor 7.0 release will have logs in a database, and we can use DB
triggers to achieve the same thing. But I do not know how to achieve this
in the current release of Condor.

Several solution 'fragments':

1) Have your jobs indicate their own completion. If possible, this is
quick and simple, but there is the issue of a job being terminated just
as it is sending this message, so the action can end up being done twice.

2) Have your auto submit system keep a record of the log files of all
the jobs it runs.
Then periodically scan those logs for a job start/end.
If you want to be clever, you can spot a start and trigger a permanent
tail on the log file until the job ends or is evicted. That gets you
very rapid notification of job completion in the general case, for
some additional complexity.
If your submission system is in any kind of decent programming
environment, you should be able to do this with threads rather than
separate processes, which would reduce the cost of waiting on several
hundred files (file handles rather than threads would be the likely
limiting factor). If your submissions go through a web/app server, then
take care with the threading, as those environments often
don't like you jumping into low-level threading.

3) Have the emails sent to some system which can convert them into the
relevant notification (SMTP gateways can be a pain to manage, though).

4) The condor_mail application can be redirected to any executable you
like, as long as it expects a certain set of command line arguments as
well as some data on stdin. Craft your own app which takes that input,
parses it (or just takes it as a hint to go off and look at the relevant
files to see what happened) and does the appropriate action. You could
have it daisy chain to any other app you wanted, thus allowing the
normal mail program to be run too (potentially with you modifying
the arguments or email as you see fit).
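
As a rough illustration of option 2, here is a minimal Python sketch of the periodic scan. It assumes each event in the job log starts with a header line shaped like "005 (1234.000.000) 05/31 10:15:22 Job terminated.", and that codes 005, 004 and 009 mean terminated, evicted and aborted; check both against the logs your Condor version actually writes. The LogScanner class and its method names are made up for this example.

```python
import re
from typing import Dict, List, Tuple

# ASSUMPTION: an event header line looks like
#   005 (1234.000.000) 05/31 10:15:22 Job terminated.
EVENT_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")

# ASSUMPTION: event codes that mean the job stopped running.
END_EVENTS = {"005": "terminated", "004": "evicted", "009": "aborted"}


class LogScanner:
    """Hypothetical helper: polls job logs and reports new end events."""

    def __init__(self) -> None:
        # Byte offset already consumed in each watched log file.
        self._offsets: Dict[str, int] = {}

    def watch(self, path: str) -> None:
        """Start tracking a log file (call this when a job is submitted)."""
        self._offsets.setdefault(path, 0)

    def poll(self) -> List[Tuple[str, str, str]]:
        """Return (log path, 'cluster.proc', event name) for new end events."""
        found = []
        for path, offset in list(self._offsets.items()):
            try:
                with open(path, "rb") as fh:
                    fh.seek(offset)
                    while True:
                        raw = fh.readline()
                        if not raw.endswith(b"\n"):
                            break  # EOF or half-written line: retry next poll
                        offset = fh.tell()
                        m = EVENT_RE.match(raw.decode("utf-8", "replace"))
                        if m and m.group(1) in END_EVENTS:
                            job_id = f"{int(m.group(2))}.{int(m.group(3))}"
                            found.append((path, job_id, END_EVENTS[m.group(1)]))
            except FileNotFoundError:
                continue  # log not created yet; try again on the next poll
            self._offsets[path] = offset
        return found
```

One thread calling poll() every few seconds covers all the logs without holding any file open between polls, which sidesteps the file handle limit at the price of slower notification than a permanent tail.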

Number 4 is rather nice but needs some testing so you can see how the
data is passed across (it is an assumed and therefore fragile API,
but one which is unlikely to change massively in the near future). I
particularly like the idea of writing a basic framework app which
passes the 'instruction' on to another executable, with simple
modification possibilities, and then using that to plug in multiple
different implementations (MQ, nicer email, external database, your
system here).
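
A minimal Python sketch of such a framework app follows. Everything in it is an assumption for illustration: the handler signature, the function names, and above all the idea that the notification arrives as command line arguments plus a message body on stdin, which is exactly the part that needs testing against a real Condor install.

```python
import subprocess
import sys
from typing import Callable, List, Sequence, TextIO

# A handler receives the original arguments and the message body.
Handler = Callable[[Sequence[str], str], None]


def make_forwarder(command: List[str]) -> Handler:
    """Build a handler that re-runs `command` with the same args and stdin.

    Hypothetical use: chain to the normal mail program so email still works.
    """
    def forward(args: Sequence[str], body: str) -> None:
        subprocess.run(command + list(args), input=body, text=True, check=False)
    return forward


def dispatch(args: Sequence[str], body: str,
             handlers: Sequence[Handler]) -> List[str]:
    """Pass the notification to every handler and collect failure messages.

    One broken plug-in (MQ, database, ...) must not stop the others.
    """
    failures = []
    for handler in handlers:
        try:
            handler(args, body)
        except Exception as exc:
            name = getattr(handler, "__name__", repr(handler))
            failures.append(f"{name}: {exc}")
    return failures


def main(argv: Sequence[str], stdin: TextIO,
         handlers: Sequence[Handler]) -> int:
    """Read the body from stdin, run all handlers, return an exit status."""
    body = stdin.read()
    failures = dispatch(argv[1:], body, handlers)
    for failure in failures:
        print(failure, file=sys.stderr)
    return 1 if failures else 0
```

The script's entry point would then call something like sys.exit(main(sys.argv, sys.stdin, [make_forwarder(["/usr/bin/mail"]), my_mq_handler])), where the mail path and my_mq_handler are placeholders for whatever you plug in.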

Just some food for thought

Matt