
Re: [Condor-users] Automatically detecting job completion and file transfer



Thanks Sateesh,

The manual for condor_wait says that it simply parses the job's log file,
which is what my program already does.  So I guess it will suffer from the
same problem - can anyone confirm or refute this?
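
For reference, the log check described in my original message (quoted below)
can be sketched roughly as follows.  This is only an illustration: the class
and method names are made up, and the two regular expressions assume that a
termination event starts with a line like "005 (cluster.proc.subproc) ..." and
is followed by a line containing "(return value N)".  It says nothing about
whether the output files have arrived yet, which is the open question.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Condor: scans user-log text for event 005.
public class CondorLogScan {
    // Assumed header shape: "005 (cluster.proc.subproc) date time Job terminated."
    private static final Pattern TERMINATED =
        Pattern.compile("^005 \\(\\d+\\.\\d+\\.\\d+\\)", Pattern.MULTILINE);

    // Assumed detail line: "(1) Normal termination (return value 0)"
    private static final Pattern RETURN_VALUE =
        Pattern.compile("\\(return value (-?\\d+)\\)");

    // True if the log text contains a "005" (Job terminated) event header.
    static boolean hasTerminated(String logText) {
        return TERMINATED.matcher(logText).find();
    }

    // Exit code reported after the first "005" header, if one is present.
    static OptionalInt returnValue(String logText) {
        Matcher header = TERMINATED.matcher(logText);
        if (!header.find()) {
            return OptionalInt.empty();
        }
        Matcher rv = RETURN_VALUE.matcher(logText);
        if (rv.find(header.end())) {
            return OptionalInt.of(Integer.parseInt(rv.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Made-up sample in the assumed log layout, for illustration only.
        String sample =
            "000 (012.000.000) 04/12 15:33:01 Job submitted from host: <sample>\n"
            + "...\n"
            + "005 (012.000.000) 04/12 15:34:10 Job terminated.\n"
            + "\t(1) Normal termination (return value 0)\n"
            + "...\n";
        System.out.println("terminated: " + hasTerminated(sample));
        System.out.println("exit code:  " + returnValue(sample));
    }
}
```

In the real program the log text would come from the file named on the "log ="
line of the job description file, re-read each time the status is polled.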

Jon 

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Sateesh Potturu
> Sent: 12 April 2006 12:14
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] Automatically detecting job 
> completion and file transfer
> 
> We invoke condor_wait for this.
> 
> I am not sure if there are better alternatives to this.
> 
> Regards,
> Sateesh
> Technical Consultant, Wipro Technologies
> Tel: +91 80 30294381 Mobile: +91 9845205258 
> E-mail:sateesh.potturu@xxxxxxxxx
> 
> 
> Jon Blower wrote on 04/12/2006 03:33 PM:
> > Dear all,
> >
> > This question has probably been asked before but I haven't been able
> > to find an answer on Google or the mailing list archives.  I'm writing
> > a Java program that submits jobs to a Condor pool.  The Java program
> > runs on a submit host and generates job description files that look
> > like this:
> >
> > executable = /home/jon/bin/helloworld
> > universe = vanilla
> > input = stdin
> > output = stdout
> > error = stderr
> > log = condor.log
> > initialdir = /some/directory
> > Queue
> >
> > I submit the job by calling condor_submit from Java's Runtime.exec()
> > method.  This bit works fine.
> >
> > My problem is detecting categorically when the job has completed *and*
> > the output files (stdout and stderr) have been transferred back to the
> > submit host.  My first stab at the Java program detects the status of
> > the job ("submitted", "running", "complete") by parsing the log file
> > that is produced.  It also gets the exit code of the executable from
> > this log file.
> >
> > To detect job completion, my program looks for the "005" event ("Job
> > terminated") in the log file.  However, it seems that this event is 
> > sent to the log file *before* the contents of the stdout and stderr 
> > files are transferred to the submit host.  If I check the length of 
> > the stdout and stderr files on the submit host (using the length() 
> > method of java.io.File) they both report zero immediately after the 
> > "005" event is detected in the log file.  If I wait a few seconds, the
> > length() method reports the correct length, indicating that these
> > files (or at least their contents) are transferred a few seconds after
> > the "005" event.
> >
> > Does Condor send any notifications (to the log file or elsewhere) that
> > happen categorically after the job is finished and the files have been
> > transferred to the submit host?
> >
> > Thanks in advance for any advice,
> > Jon
> >
> > -------------------------------------------------------------- 
> > Dr Jon Blower              Tel: +44 118 378 5213 (direct line) 
> > Technical Director         Tel: +44 118 378 8741 (ESSC) 
> > Reading e-Science Centre   Fax: +44 118 378 6413 
> > ESSC                       Email: jdb@xxxxxxxxxxxxxxxxxxxx 
> > University of Reading
> > 3 Earley Gate
> > Reading RG6 6AL, UK
> > --------------------------------------------------------------
> >  
> >
> > _______________________________________________
> > Condor-users mailing list
> > Condor-users@xxxxxxxxxxx
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> >   