
Re: [Condor-users] Job continually being run due to shadowexceptionerrors.



Matt and Jaime

I modified the submit file as below:

executable = egs.exe
environment = XPERT_DIR=\\arthur-lu\montecarlo
output     = D7EG9AB.log
log        = D7EG9AB.condorlog
arguments  = D7EG9AB.egs
universe   = vanilla
transfer_input_files = D7EG9AB.egs,auto_design7.pegsdat
#transfer_input_files = egs.exe,D7EG9AB.egs,auto_design7.pegsdat
#transfer_output_files = D7EG9AB.log,D7EG9AB.condorlog
queue

and the problem seems to go away. As you said, there is no need
to explicitly include the default *.err, *.out, *.log or executable
in the transfer statements. Removing them, as in the submit file
above, seems to fix the problem. However, I would have thought that
explicitly including them shouldn't have caused any problems.

I guess what you're saying is that by explicitly including the log file,
it is created on the execute machine and then transferred back to
the submit machine upon completion? And this is when the file transfer
error was occurring, maybe because it was trying to transfer a file
that no longer existed, or was trying to overwrite one that was
already there on the submitting machine?
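
For reference, here is a sketch of the same submit file with the role
of each file annotated, following what Matt and Jaime described. The
should_transfer_files and when_to_transfer_output lines are assumptions
added for illustration only; they are not in the file above, and
whether they are needed will depend on the pool's configuration.

```
executable = egs.exe
environment = XPERT_DIR=\\arthur-lu\montecarlo
arguments  = D7EG9AB.egs
universe   = vanilla

# stdout of the job; Condor brings this back without it being listed
output     = D7EG9AB.log

# Condor user log, written on the submit side; not listed in any transfer line
log        = D7EG9AB.condorlog

# Assumed for illustration (not in the original submit file)
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

# Only the extra input files; the executable is not listed here
transfer_input_files = D7EG9AB.egs,auto_design7.pegsdat
queue
```

The point of the sketch is only that the log file and the executable
appear under their own commands and nowhere in the transfer lists.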

Thanks for all your help. I'll let you know how we go.

Cheers

Greg

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matt Hope
> Sent: Thursday, 16 February 2006 4:33 PM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] Job continually being run due to 
> shadowexceptionerrors.
> 
> 
> On 2/16/06, Greg.Hitchen@xxxxxxxx <Greg.Hitchen@xxxxxxxx> wrote:
> > The D7EG9AB.condorlog file is simply the standard condor user log 
> > file. Shouldn't condor be creating this, not the job?
> >
> > The submit file is listed below.
> >
> 
> > log        = D7EG9AB.condorlog
> 
> > transfer_output_files = D7EG9AB.log,D7EG9AB.condorlog
> 
> Jaime stated this, but just to make it crystal clear, the 
> output and error arguments refer to the files into which condor 
> will place the stdout and stderr streams (if any) and transfer back.
> 
> The log argument specifies a file on the local (schedd/shadow) 
> end to write information concerning the progress of the 
> job/cluster. There is no need to specify this as a file to 
> transfer, since it is never touched by the execution 
> (startd/starter) part of the process.
> 
> In fact, specifying explicit file transfers is generally 
> unnecessary unless you have some performance reason not to 
> send things back, such as temp files created in the local 
> directory. If you can make your job place only useful stuff in 
> the working directory and use %TEMP%, /tmp or some 
> subdirectory of the working directory for everything else, 
> then those files won't be transferred back anyway.
> 
> Matt