
Re: [Condor-users] DAGMAN, 7.4.0, and NFS

Craig A. Struble, Ph.D. wrote:
> I recently tried updating one of my pools to Condor 7.4.0. After doing
> so, dagman jobs that used to submit just fine with 7.3.x get the
> following error:
> 11/21 12:31:24 ERROR: log file foo.log is on NFS.
> I've tried setting in the pool's configuration file
> ## Allow NFS Log files
> and passing -allowlogerror to dagman, neither of which seem to have any
> effect.
> While I understand log files on NFS filesystems are dangerous and
> unstable, we've run these jobs thousands of times without problems. I'd
> like some mechanism for making this a warning instead of an error.
>     Craig
> -- 
> Craig A. Struble, Ph.D. | 369 Cudahy Hall  | Marquette University
> Associate Professor of Computer Science    | (414)288-3783
> Director, Master of Bioinformatics Program | (414)288-5472 (fax)
> http://www.mscs.mu.edu/~cstruble | craig.struble@xxxxxxxxxxxxx


A quick scan of the code (src/condor_dagman/job.cpp) suggests this test cannot be skipped.

I believe that portions of the code are overly cautious about NFS. It should be possible to override this test.

Kent, is there an extraordinary concern about log files being on NFS if they are only read/written from a single node (the submit node) where dagman is running?