
[HTCondor-users] Scratch Directory problem



I'm running into a problem with temporary run (scratch) directories. I am submitting vanilla universe jobs to a cluster of Linux machines. For some reason, on one of them, the jobs do not always run in, for example, /var/lib/condor/execute/dir_<PID>/.…. (where <PID> is a unique number tied to the process ID). In some cases they end up running in roughly the same location they were submitted from (like /home/username/launch_dir).

The result is that multiple instances of the job run in the same location, causing file conflicts and job failures. The affected machine also serves as a submit node, but other than that its local config files are the same as on the other machines.

Is there a configuration variable I am missing here?

Relatedly, I hoped I could simply exclude that machine with a requirements expression like (Target.Machine != "xxx.xxx.xxx.xxx"), but that didn't work either.

Thanks in advance for any ideas.

Cheers,
Mike

Mike Fienen
USGS Wisconsin Water Science Center