
Re: [HTCondor-users] Network filesystem failed to initialize logs



Are you using autofs to mount the glusterfs fuse mount? If so, my guess is that you have a race condition in which autofs has not yet mounted the filesystem when the condor jobs show up. This happened to us a lot too (we use autofs). We got around it by applying a precondition to the job (through a shell script) that touches a file in the directory.
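
A minimal sketch of what such a precondition could look like, not our exact script. The function name wait_for_dir, the marker file name, and the retry count and delay are all made up for illustration, and it assumes the job's user can write to the target directory:

```shell
#!/bin/sh
# Hypothetical helper: poke a directory until a file can be created in it.
# Accessing the path is what prompts autofs to mount the filesystem.
#
# usage: wait_for_dir DIR [TRIES] [DELAY_SECONDS]
wait_for_dir() {
    dir=$1
    tries=${2:-5}     # assumed default: 5 attempts
    delay=${3:-2}     # assumed default: 2 seconds between attempts
    i=1
    while [ "$i" -le "$tries" ]; do
        # Try to create a marker file; clean it up again on success.
        if touch "$dir/.mount_check.$$" 2>/dev/null; then
            rm -f "$dir/.mount_check.$$"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1          # directory never became usable
}

# Example use in a wrapper script (the path is a placeholder):
#   wait_for_dir /path/to/log/dir || exit 1
#   exec "$@"
```

The retry loop is only there because the first access can fail while the mount is still coming up; if a single touch is enough on your setup, the loop can go.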

I hope this helps,
    -C

On 8/29/19 5:26 PM, João Baúto wrote:
Hi,

Some of my users keep having their jobs put on hold due to problems with the initialization of the error and log files. They are setting the paths of these files to the network filesystem (a glusterfs mount via fuse).

The only way to fix this is to force an ls on the target directory and then run condor_release.

Any ideas on why this is happening?

I'm running HTCondor v.8.8.4 on CentOS 7.6.

Thanks!
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org



-- 


Christopher Harrison
Systems Engineer
Department of Biostatistics & Medical Informatics
University of Wisconsin School of Medicine and Public Health
Office 240 Warf
610 Walnut Street
Madison, WI 53726
608.3476.6967