[Condor-users] Defining an exit script for condor jobs
- Date: Thu, 06 Oct 2005 12:13:12 -0700
- From: Terrence Martin <tmartin@xxxxxxxxxxxxxxxx>
- Subject: [Condor-users] Defining an exit script for condor jobs
I asked this question a couple of months ago, but I want to raise it
again because I did not follow up on the one response I got.
My question was whether it is possible to have a script run on job exit
that goes beyond what the normal Condor exit handling does when cleaning
up temporary areas. This matters on the Open Science Grid clusters I am
currently working with, since user files are often stored in a temporary
area that Condor does not necessarily know about. It would be nice to
have this area cleared on exit.
The answer I got was to use either a wrapper or DAGMan.
The first solution does not work if I follow the rules for
USER_JOB_WRAPPER in the Condor documentation, which say the wrapper
should not fork a child and should only call exec: once the wrapper
execs the job, nothing of the wrapper is left to run cleanup afterwards.
I could have the wrapper fork instead, but it is not clear that I
should. What would be nice is if, in addition to USER_JOB_WRAPPER, there
were a USER_JOB_EXIT_SCRIPT that could define a script to perform
certain cleanup steps on job exit.
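For illustration, here is a minimal sketch of what the fork-style wrapper
would look like, i.e. the variant that goes against the exec-only
guidance mentioned above. It is written in Python purely as an example;
the scratch path and the function name run_then_cleanup are hypothetical,
and how Condor's signal and exit-status handling interacts with a wrapper
that stays alive like this is not something tested here.

```python
#!/usr/bin/env python3
"""Sketch: a job wrapper that runs the job as a child, then cleans up."""
import shutil
import subprocess
import sys


def run_then_cleanup(job_argv, scratch_dir):
    """Run the job as a child process, then remove scratch_dir.

    Returns the job's exit code so the wrapper can pass it on.
    """
    try:
        # Unlike exec, this keeps the wrapper alive while the job runs.
        return subprocess.call(job_argv)
    finally:
        # Runs whether the job succeeded, failed, or could not be started.
        shutil.rmtree(scratch_dir, ignore_errors=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical path; a real site would derive this per job.
    SCRATCH_DIR = "/tmp/osg-scratch-example"
    sys.exit(run_then_cleanup(sys.argv[1:], SCRATCH_DIR))
```

The obvious weakness is that the cleanup only runs if the wrapper itself
survives: if the wrapper is killed outright, the finally block never
executes.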
As for DAGMan, I am not sure how it would help. According to the Condor
documentation, DAGMan is a meta-scheduler that submits to Condor. That
sounds like it works on the outside, between the user and Condor. The
grid software I work with is already thick with schedulers in front of
Condor, and I cannot enforce what users make use of on that side. All I
can control is my Condor queue and my worker nodes. Admittedly, my
knowledge of DAGMan extends only to what I read here:
http://www.cs.wisc.edu/condor/dagman/
but it does not sound like what I am looking for.
I guess I have another option, which is to try to be clever. Just before
my user wrapper execs the actual job, it could start a monitoring
process that watches for the job to exit and then tries to clean up.
This monitor would also have to end up being an orphan process, since
the parent calls exec right after it spawns the monitor. It would be
simpler and probably less error-prone if Condor could just trigger a
cleanup process, though.
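The monitoring idea could be sketched roughly as follows, again in
Python purely as an illustration. The wrapper forks a monitor and then
execs the job, so the job keeps the wrapper's pid and remains the
monitor's parent. The monitor polls os.getppid(), which changes once the
parent exits and the monitor is reparented. The names spawn_monitor and
SCRATCH_DIR are hypothetical. Whether Condor's own process tracking
would kill such a monitor when the job ends is an open assumption that
would need to be checked on a real worker node.

```python
#!/usr/bin/env python3
"""Sketch: wrapper that spawns a cleanup monitor, then execs the job."""
import os
import shutil
import sys
import time


def spawn_monitor(watch_pid, scratch_dir, poll_seconds=5.0):
    """Fork a monitor that removes scratch_dir once watch_pid has exited.

    watch_pid must be the pid of the calling process, which is the
    monitor's parent. Returns in the parent; never returns in the monitor.
    """
    if os.fork() != 0:
        return  # parent: go on to exec the real job
    # Monitor process: leave the job's session and process group.
    os.setsid()
    # While the parent is alive, getppid() still reports its pid. Once it
    # exits, the monitor is orphaned and getppid() changes.
    while os.getppid() == watch_pid:
        time.sleep(poll_seconds)
    shutil.rmtree(scratch_dir, ignore_errors=True)
    os._exit(0)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical path; a real site would derive this per job.
    SCRATCH_DIR = "/tmp/osg-scratch-example"
    spawn_monitor(os.getpid(), SCRATCH_DIR)
    # exec replaces the wrapper with the job and keeps the same pid.
    os.execvp(sys.argv[1], sys.argv[1:])
```

Checking the parent pid rather than probing an arbitrary pid avoids
being fooled by pid reuse, but the monitor still inherits the job's
open file descriptors, and nothing reports a failed cleanup back to
Condor.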