
RE: [Condor-users] Trouble with the log file and Condor.pm sub Monitor()...



I forgot to add:

This is a bug in Condor::Monitor() -- while Condor uses a combination of
initial_dir and log when the log setting is relative, the Monitor() sub
only ever looks at log, which is why I had to add the final log setting
in the submission ticket. Really, the monitor process should be using
the last log + initial_dir settings before the last queue call, but
it doesn't.
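
To illustrate the lookup I would have expected, here is a minimal Python sketch. It is not the Condor.pm code; the function name log_for_last_queue, the simplified ticket parsing, and the /submit directory are all assumptions made for illustration. It tracks the most recent log and initial_dir settings and resolves them together at each queue line.

```python
import posixpath

def log_for_last_queue(ticket_text, submit_dir="."):
    """Return the user log path in effect at the last queue statement.

    Hypothetical helper: it only understands simple 'key = value' lines,
    comments starting with '#', and bare 'queue' statements.
    """
    log = None
    initial_dir = None
    resolved = None
    for raw in ticket_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.lower().split()[0] == "queue":
            # Snapshot the settings that apply to the job(s) queued here.
            if log is not None:
                base = posixpath.join(submit_dir, initial_dir or "")
                resolved = posixpath.normpath(posixpath.join(base, log))
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip().lower()
        if key == "log":
            log = value.strip()
        elif key == "initial_dir":
            initial_dir = value.strip()
    return resolved

# Example ticket modelled on the one quoted later in this thread.
TICKET = """\
log = ./condor.0.ticket.log
initial_dir = no_sweep_parameter/adc_fir1
queue
initial_dir = no_sweep_parameter/art_core
queue
"""

print(log_for_last_queue(TICKET, "/submit"))
```

Settings that appear after the last queue line are ignored by this sketch, which is the opposite of the behaviour I had to work around.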

Cheers!
Ian

> -----Original Message-----
> From: Ian Chesal 
> Sent: July 5, 2004 3:35 PM
> To: 'Condor-Users Mail List'
> Subject: RE: [Condor-users] Trouble with the log file and 
> Condor.pm sub Monitor()...
> 
> 
> Okay. I have a solution to my problem. I had to play around 
> with different combinations of the log and initial_dir 
> settings to get the exact behaviour I wanted: I want the 
> output from my jobs captured in separate directories for 
> each job, but I want a central log file written for all the 
> jobs that the Monitor() process can watch and act on. 
> 
> The attached submission ticket shows how I fixed the problem. 
> I used a relative log file setting and reset it after every 
> initial_dir call so it always pointed to the same log file. 
> At the end of the submission ticket, after the last queue 
> call, I reset the log setting so the perl script (which uses 
> the last log instance found in the ticket when you call 
> Monitor) finds the right log file.
> 
> Now it all works the way I want it to work.
> 
> Cheers!
> Ian
> 
>  
> 
> > -----Original Message-----
> > From: condor-users-bounces@xxxxxxxxxxx
> > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
> > Sent: July 5, 2004 1:13 PM
> > To: Condor-Users Mail List
> > Subject: RE: [Condor-users] Trouble with the log file and 
> > Condor.pm sub Monitor()...
> > 
> > 
> > This is close, but I'm still experiencing problems with the
> > module. I added the full path to my log file and now I'm only 
> > getting one log file for all my jobs. This is great. The Perl 
> > module isn't complaining that it can't find the log file 
> > anymore. But the log file is not being updated properly.
> > 
> > If I have two jobs in my submission ticket my log file gets
> > written to when the jobs are submitted. But then when the 
> > jobs begin execution and finish execution nothing gets 
> > written to the log file so calling Condor::Monitor( $cluster 
> > ) in my perl script is causing the script to "hang" -- it 
> > doesn't know the jobs have completed.
> > 
> > I've attached the submission ticket, output from
> > condor_submit (as captured by the perl module call to 
> > Condor::Submit) and the log file for reference -- am I losing 
> > the log file updates because I'm changing initial_dir even 
> > though I've specified an absolute path to the log file? 
> > That's my suspicion. I'll try playing around with 
> > re-specifying log when I change initial_dir.
> > 
> > Thanks!
> > 
> > Ian
> > 
> > > -----Original Message-----
> > > From: condor-users-bounces@xxxxxxxxxxx 
> > > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jaime Frey
> > > Sent: June 30, 2004 9:45 PM
> > > To: Condor-Users Mail List
> > > Subject: Re: [Condor-users] Trouble with the log file and
> > > Condor.pm sub Monitor()...
> > > 
> > > 
> > > On Jun 30, 2004, at 3:06 PM, Ian Chesal wrote:
> > > 
> > > > I seem to be unable to use the Monitor() subroutine from the Condor.pm
> > > > module because of my cluster configuration. I'm getting the error:
> > > >
> > > > error opening ./condor.0.ticket.log: No such file or directory
> > > >
> > > > My submission ticket for my cluster of jobs looks like this:
> > > >
> > > > # GLOBAL EXPERIMENT CONFIGURATION
> > > >
> > > > universe = vanilla
> > > > should_transfer_files = YES
> > > > when_to_transfer_output = ON_EXIT
> > > > log = ./condor.0.ticket.log
> > > > output = wrapper.log
> > > > error = wrapper.err
> > > > executable = wrapper.bat
> > > >
> > > > # LOCAL EXPERIMENT CONFIGURATION
> > > >
> > > > # Job 1
> > > > initial_dir = no_sweep_parameter/adc_fir1
> > > > arguments = /experiments/ichesal/test/no_sweep_parameter/adc_fir1
> > > > requirements = (Arch == "INTEL" && (OpSys == "WINNT40" || OpSys == "WINNT50" || OpSys == "WINNT51"))
> > > > queue
> > > >
> > > > # Job 2
> > > > initial_dir = no_sweep_parameter/art_core
> > > > arguments = /experiments/ichesal/test/no_sweep_parameter/art_core
> > > > requirements = (Arch == "INTEL" && (OpSys == "WINNT40" || OpSys == "WINNT50" || OpSys == "WINNT51"))
> > > > queue
> > > >
> > > > The trouble is with the condor.0.ticket.log file -- it's not being
> > > > created globally for all the jobs in the cluster. Instead there's a
> > > > log file for each job like this:
> > > >
> > > > 	no_sweep_parameter/adc_fir1/condor.0.ticket.log
> > > > 	no_sweep_parameter/art_core/condor.0.ticket.log
> > > >
> > > > Did I set the job up wrong? I was expecting only one
> > > > condor.0.ticket.log in the same directory where I submitted the 
> > > > cluster of jobs from.
> > > 
> > > If you give a relative pathname for your user logfile, it's resolved
> > > relative to initial_dir. If initial_dir isn't specified, then it's
> > > resolved relative to the current working directory that condor_submit
> > > is run from.
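
The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this thread, not Condor's actual implementation; the function name resolve_user_log and the /submit directory are hypothetical, and POSIX-style paths are assumed for simplicity.

```python
import posixpath

def resolve_user_log(log, initial_dir=None, submit_cwd="."):
    """Resolve a user log path the way the rule above describes.

    A relative log is taken relative to initial_dir when one is set,
    otherwise relative to the directory condor_submit was run from.
    An absolute log is returned unchanged (apart from normalisation).
    """
    if initial_dir is None:
        base = submit_cwd
    else:
        # initial_dir itself may be relative to the submit directory.
        base = posixpath.join(submit_cwd, initial_dir)
    return posixpath.normpath(posixpath.join(base, log))

# The two jobs from the ticket each end up with their own log file:
for job_dir in ("no_sweep_parameter/adc_fir1", "no_sweep_parameter/art_core"):
    print(resolve_user_log("./condor.0.ticket.log", job_dir, "/submit"))
```

Under the same rule, a relative value that climbs back out of the job directory, such as ../../condor.0.ticket.log (a hypothetical value, not taken from the ticket), would resolve to a single log file in the submit directory for both jobs.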
> > > 
> > > 
> > > +----------------------------------+---------------------------------+
> > > |            Jaime Frey            | I stayed up all night playing   |
> > > |        jfrey@xxxxxxxxxxx         | poker with tarot cards. I got a |
> > > |  http://www.cs.wisc.edu/~jfrey/  | full house and four people died.|
> > > +----------------------------------+---------------------------------+
> > > 
> > > _______________________________________________
> > > Condor-users mailing list
> > > Condor-users@xxxxxxxxxxx 
> > > http://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > > 
> > 
>