
Re: [Condor-users] Defining an exit script for condor jobs



Have you had a look at the Condor Perl module? It allows you to specify a
function to be called when the job exits (you can specify clean exit, ...), and
you could perform your cleanup in that function.

Cyclotron Institute, Texas A&M university
ZIP 77843-3366
(979)-845-1411 ext. 258
Mobile: (979)-571-9782
homepage: http://demon.ulb.ac.be/yeehaa/yeehaa.html  

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Terrence Martin
> Sent: Thursday, October 06, 2005 14:13
> To: Condor-Users Mail List
> Subject: [Condor-users] Defining an exit script for condor jobs
> 
> I asked this question a couple months ago but I wanted to put 
> it out again because I did not follow up on the one response I got.
> 
> My question was whether it is possible to have a script run 
> on job exit that can go beyond what the normal condor exit 
> does in terms of cleaning up areas. This is important in the 
> current Open Science Grid clusters I am working with since 
> often user files are stored in a temporary area that condor 
> does not necessarily know about. It would be nice to have 
> this area cleared on exit.
> 
> The answer I got was to use either a wrapper or DAGMan.
> 
> The first solution does not work if I follow the rule in the 
> condor documentation that a USER_JOB_WRAPPER must not fork a 
> child and should only call exec: once the wrapper execs the 
> job, nothing is left running to do the cleanup. I could break 
> that rule, but it is not clear I should. What would be nice is 
> if, in addition to USER_JOB_WRAPPER, there were a 
> USER_JOB_EXIT_SCRIPT which could define a script that 
> performs certain cleanup steps on job exit.
> 
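
For comparison, here is a rough sketch of what the forking variant of a wrapper might look like. It deliberately departs from the exec-only rule mentioned above, so treat it as an illustration of the trade-off rather than a recommended setup. The function name run_with_cleanup and the scratch directory argument are made up for this example; nothing here is a Condor API.

```python
import shutil
import signal
import subprocess


def run_with_cleanup(argv, scratch_dir):
    """Run argv as a child process, remove scratch_dir afterwards,
    and return an exit status for the wrapper to pass on."""
    child = subprocess.Popen(argv)

    def forward(signum, _frame):
        # Relay termination requests to the job so it is not left behind.
        child.send_signal(signum)

    forwarded = (signal.SIGTERM, signal.SIGINT)
    previous = {sig: signal.signal(sig, forward) for sig in forwarded}
    try:
        returncode = child.wait()
    finally:
        # Restore the old handlers, then clean up whether or not the job failed.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
        shutil.rmtree(scratch_dir, ignore_errors=True)

    # Popen reports death by signal N as -N; map it to the shell's 128+N.
    return returncode if returncode >= 0 else 128 - returncode
```

A wrapper script built on this would end with something like sys.exit(run_with_cleanup(sys.argv[1:], scratch_dir)), where scratch_dir is whatever area your site uses. The cost is the one the documentation warns about: the wrapper, not the job, is the process Condor started, so signals and exit status only reach the right place if the wrapper relays them correctly.
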
> As for DAGMan, I am not sure how that would help. According 
> to the condor documentation, DAGMan is a meta-scheduler that 
> submits to condor. That sounds like it works on the outside, 
> between the user and condor.  The grid software I work with is 
> already thick with schedulers to condor and I cannot enforce 
> what users make use of on that side. All I can control is my 
> condor queue and my worker nodes. Admittedly my knowledge of 
> DAGMan extends only to what I read here 
> http://www.cs.wisc.edu/condor/dagman/
> but it does not sound like what I am looking for.
> 
> I guess I have another option: try to be clever. Just 
> before my user wrapper drops to the actual job I could start 
> a monitoring process that watches for the job to exit and 
> then tries to clean up. It would be simpler and probably less 
> error-prone if condor could just trigger a cleanup process 
> though.  The monitor would also have to end up being an orphan 
> process, since the parent calls exec right after it spawns 
> the monitor.
> 
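
The monitor idea could be sketched roughly as below, assuming a POSIX system. The names spawn_monitor and pid_alive are invented for this example. The wrapper would call spawn_monitor(os.getpid(), scratch_dir) and then exec the job; since exec keeps the PID, the monitor ends up watching the job itself.

```python
import os
import shutil
import time


def pid_alive(pid):
    """Return True while a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def spawn_monitor(job_pid, scratch_dir, poll_seconds=1.0):
    """Start a detached watcher that removes scratch_dir once job_pid is gone."""
    first = os.fork()
    if first > 0:
        os.waitpid(first, 0)  # reap the short-lived intermediate child
        return
    os.setsid()  # intermediate child: leave the caller's session
    if os.fork() > 0:
        os._exit(0)
    # Grandchild: orphaned on purpose, so it survives the caller's exec.
    while pid_alive(job_pid):
        time.sleep(poll_seconds)
    shutil.rmtree(scratch_dir, ignore_errors=True)
    os._exit(0)
```

Some caveats: polling by PID can be fooled by PID reuse, an exited job that has not been reaped yet still counts as alive, and Condor may well kill stray processes belonging to the job when it exits, in which case the monitor never gets to run its cleanup. That is part of why a hook triggered by Condor itself would be less error-prone.
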
> Terrence