
Re: [Condor-users] 6.8.0 and NFS Problem



On Sat, 5 Aug 2006, Nick LeRoy wrote:

> On Saturday 05 August 2006 6:06 am, vetter wrote:
> > OK, perhaps I should ask fewer questions:
> > why do I get the following message?
> >
> > ~> condor_submit ls.job
> > Submitting job(s)
> > WARNING: Log file /home/vetter/test.log is on NFS.
> > This could cause log file corruption and is _not_ recommended.
> > .
> > Logging submit event(s).
> > 1 job(s) submitted to cluster 43.
> 
> Condor can't rely on the file locking mechanisms of NFS.  In particular, user 
> logs can get corrupted if written to by multiple writers (i.e. if you have 
> several jobs running at the same time writing to the same log).  If your user 
> log gets corrupted, this can confuse tools like DAGMan that rely on the user 
> log.  We have seen this problem occur often enough that we felt it warranted 
> the warning message.
> 
> Hope this helps

OK, I understand. Is there a way to suppress this warning for people 
who are aware of the risk? When started in an NFS directory, the warning 
makes condor_run think that condor_submit failed, although the submission 
actually succeeded. condor_run then removes the job files, so when the job 
runs it cannot find them and does nothing.

A simple workaround is to modify condor_run to use /tmp, which is usually 
local to the machine.

Or one could make condor_run ignore this warning.
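
To illustrate the second idea, here is a minimal Python sketch of how a
wrapper could ignore the NFS warning and still detect a successful
submission. This is not how condor_run is implemented; the function names
are hypothetical, and the patterns are taken only from the condor_submit
output quoted above, so they may need adjusting for other versions.

```python
import re

# Patterns for the two lines of the NFS warning, copied from the
# condor_submit output quoted in this thread (assumed stable wording).
NFS_WARNING_PATTERNS = (
    re.compile(r"^WARNING: Log file .* is on NFS\.$"),
    re.compile(r"^This could cause log file corruption and is _not_ recommended\.$"),
)

# The success line, e.g. "1 job(s) submitted to cluster 43."
SUBMITTED = re.compile(r"^(\d+) job\(s\) submitted to cluster (\d+)\.$")


def strip_nfs_warning(output):
    """Return the output lines with the known NFS warning lines removed."""
    kept = []
    for line in output.splitlines():
        stripped = line.strip()
        if any(p.match(stripped) for p in NFS_WARNING_PATTERNS):
            continue  # drop the warning, keep everything else
        kept.append(line)
    return kept


def submitted_cluster(output):
    """Return the cluster id if the output reports a submission, else None."""
    for line in strip_nfs_warning(output):
        match = SUBMITTED.match(line.strip())
        if match:
            return int(match.group(2))
    return None
```

A wrapper would capture the combined output of condor_submit, pass it to
submitted_cluster(), and remove the job files only when the result is None.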

Regards,
  Andreas