
RE: [Condor-users] Trouble with the log file and Condor.pm subMonitor()...



Okay. I have a solution to my problem. I had to play around with
different combinations of the log and initial_dir settings to get the
exact behaviour I wanted: I want the output from my jobs captured in
separate directories for each job, but I want a central log file
written for all the jobs that the Monitor() process can watch and act
on.

The attached submission ticket shows how I fixed the problem. I used a
relative log file setting and reset it after every initial_dir call so
it always pointed to the same log file. At the end of the submission
ticket, after the last queue call, I reset the log setting so the perl
script (which uses the last log instance found in the ticket when you
call Monitor) finds the right log file.
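
For anyone reading without the attachment, the idea looks roughly like
the sketch below. This is an illustrative reconstruction from the
description above, not the attached ticket itself. The ../../ prefix
assumes each initial_dir sits two levels below the directory the ticket
is submitted from, and the arguments and requirements lines are left
out for brevity.

```text
# GLOBAL EXPERIMENT CONFIGURATION
universe = vanilla
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = wrapper.log
error = wrapper.err
executable = wrapper.bat

# Job 1
initial_dir = no_sweep_parameter/adc_fir1
# A relative log path is resolved against initial_dir, so climb back
# up to the submit directory (assumed to be two levels up).
log = ../../condor.0.ticket.log
queue

# Job 2
initial_dir = no_sweep_parameter/art_core
# Reset log after the new initial_dir so it names the same file.
log = ../../condor.0.ticket.log
queue

# After the last queue: this setting affects no job. It is only here
# so that the last log entry in the ticket, which is the one
# Monitor() picks up, is valid relative to the submit directory.
log = ./condor.0.ticket.log
```

With this layout output and error still land in each job's own
initial_dir, while both jobs write events to the single log file in
the submit directory.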

Now it all works the way I want it to work.

Cheers!
Ian

 

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
> Sent: July 5, 2004 1:13 PM
> To: Condor-Users Mail List
> Subject: RE: [Condor-users] Trouble with the log file and 
> Condor.pm subMonitor()...
> 
> 
> This is close, but I'm still experiencing problems with the 
> module. I added the full path to my log file and now I'm only 
> getting one log file for all my jobs. This is great. The Perl 
> module isn't complaining that it can't find the log file 
> anymore. But the log file is not being updated properly.
> 
> If I have two jobs in my submission ticket my log file gets 
> written to when the jobs are submitted. But then when the 
> jobs begin execution and finish execution nothing gets 
> written to the log file so calling Condor::Monitor( $cluster 
> ) in my perl script is causing the script to "hang" -- it 
> doesn't know the jobs have completed.
> 
> I've attached the submission ticket, output from 
> condor_submit (as captured by the perl module call to 
> Condor::Submit) and the log file for reference -- am I losing 
> the log file updates because I'm changing initial_dir even 
> though I've specified an absolute path to the log file? 
> That's my suspicion. I'll try playing around with 
> re-specifying log when I change initial_dir.
> 
> Thanks!
> 
> Ian
> 
> > -----Original Message-----
> > From: condor-users-bounces@xxxxxxxxxxx
> > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jaime Frey
> > Sent: June 30, 2004 9:45 PM
> > To: Condor-Users Mail List
> > Subject: Re: [Condor-users] Trouble with the log file and 
> > Condor.pm subMonitor()...
> > 
> > 
> > On Jun 30, 2004, at 3:06 PM, Ian Chesal wrote:
> > 
> > > I seem to be unable to use the Monitor() subroutine from the
> > > Condor.pm module because of my cluster configuration. I'm getting
> > > the error:
> > >
> > > error opening ./condor.0.ticket.log: No such file or directory
> > >
> > > My submission ticket for my cluster of jobs looks like this:
> > >
> > > # GLOBAL EXPERIMENT CONFIGURATION
> > >
> > > universe = vanilla
> > > should_transfer_files = YES
> > > when_to_transfer_output = ON_EXIT
> > > log = ./condor.0.ticket.log
> > > output = wrapper.log
> > > error = wrapper.err
> > > executable = wrapper.bat
> > >
> > > # LOCAL EXPERIMENT CONFIGURATION
> > >
> > > # Job 1
> > > initial_dir = no_sweep_parameter/adc_fir1
> > > arguments = /experiments/ichesal/test/no_sweep_parameter/adc_fir1
> > > requirements = (Arch == "INTEL" && (OpSys == "WINNT40" || OpSys == "WINNT50" || OpSys == "WINNT51"))
> > > queue
> > >
> > > # Job 2
> > > initial_dir = no_sweep_parameter/art_core
> > > arguments = /experiments/ichesal/test/no_sweep_parameter/art_core
> > > requirements = (Arch == "INTEL" && (OpSys == "WINNT40" || OpSys == "WINNT50" || OpSys == "WINNT51"))
> > > queue
> > >
> > > The trouble is with the condor.0.ticket.log file -- it's not being
> > > created globally for all the jobs in the cluster. Instead there's a
> > > log file for each job like this:
> > >
> > > 	no_sweep_parameter/adc_fir1/condor.0.ticket.log
> > > 	no_sweep_parameter/art_core/condor.0.ticket.log
> > >
> > > Did I set the job up wrong? I was expecting only one 
> > > condor.0.ticket.log in the same directory where I submitted the 
> > > cluster of jobs from.
> > 
> > If you give a relative pathname for your user logfile, it's resolved
> > relative to initial_dir. If initial_dir isn't specified, then it's
> > resolved relative to the current working directory that condor_submit
> > is run from.
> > 
> > 
> > +----------------------------------+---------------------------------+
> > |            Jaime Frey            | I stayed up all night playing   |
> > |        jfrey@xxxxxxxxxxx         | poker with tarot cards. I got a |
> > |  http://www.cs.wisc.edu/~jfrey/  | full house and four people died.|
> > +----------------------------------+---------------------------------+
> > 
> > _______________________________________________
> > Condor-users mailing list
> > Condor-users@xxxxxxxxxxx
> > http://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > 
> 

Attachment: condor.0.ticket.out
Description: condor.0.ticket.out

Attachment: condor.0.ticket
Description: condor.0.ticket

Attachment: condor.0.ticket.log
Description: condor.0.ticket.log