
Re: [Condor-users] How to stop condor from removing working directory (fwd)



Hi Steve,

Steven Timm wrote:
There are several "working directories" that you could be
referring to. Are you talking about the remote (server) side
or the client side? And on the server side, are you talking about
the directory in which the job runs, or the Globus temporary directory?

Yes. In fact, I mean all of them on the server side. I wasn't aware of any temporary directories created per-job on the client side when using Condor-G (via OSG and "grid" universe jobs). Are there any?

There are four on the server side that I know of:

Globus GASS cache dir: ~/.globus/.gass_cache/md5/??/hash1/md5/??/hash2 (contains hard-linked executable, always (annoyingly) named "data" with OSG)

Logs: ~/.globus/job/FQDN/####.######### (contains stdout/err, local classad, proxy cert, io URL)

IWD: ~/gram_scratch_randomstring (contains the actual working directory of the job -- not sure what happens with systems using local directories rather than shared directories for jobs)

GRAM log: $V_L/globus/tmp/gram_job_state (contains files named gram_condor_log.####.######## whose numbers match those of the log directories above -- in fact, I've just noticed these don't seem to be cleaned up, and I have 2300 log and lock files from as far back as July)
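
Since the GRAM log directory above apparently is not cleaned up automatically, leftover files can be listed by age. The following is a minimal sketch, not part of Globus or Condor: it assumes $V_L is shorthand for $VDT_LOCATION, that the stale files all start with the prefix gram_condor_log., and that a 30-day cutoff is reasonable. The function name stale_gram_logs and its parameters are hypothetical. It only lists files; it deletes nothing.

```python
import os
import time
from pathlib import Path


def stale_gram_logs(state_dir, max_age_days=30, now=None):
    """Return files named gram_condor_log.* in state_dir that were last
    modified more than max_age_days ago, sorted by path."""
    state_dir = Path(state_dir)
    if not state_dir.is_dir():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for path in state_dir.glob("gram_condor_log.*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Assumption: VDT_LOCATION is set in the environment, as on an OSG/VDT install.
    base = os.environ.get("VDT_LOCATION")
    if base:
        state_dir = Path(base) / "globus" / "tmp" / "gram_job_state"
        for path in stale_gram_logs(state_dir):
            print(path)
```

Review the output before removing anything, since a file may belong to a job that is still running.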

There is a configuration option for the globus gatekeeper/globus-job-manager
that controls whether it deletes the temporary files of a successful job,
or of one it thinks is successful, such as one ending in "globus error 155".

This is exactly the error I am getting (globus error 155: cannot transfer output files). Can you shed any light on it? We have NAT, but no firewall, and the problem is intermittent.

The file is $VDT_LOCATION/globus/etc/globus-job-manager.conf
and the option is -save-logfile.  The default is on_error;
I believe the other option is "always", which saves everything,
but you have to be careful because the saved files will fill up the disk fast.
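
To check which value is currently in effect, the setting can be read back from the file. This is a minimal sketch under one assumption: that globus-job-manager.conf is a whitespace-separated list of command-line-style options, each value following its flag. The helper name save_logfile_setting is hypothetical and not part of any Globus tool.

```python
import shlex


def save_logfile_setting(conf_text):
    """Return the value that follows -save-logfile in conf_text,
    or None if the option is absent or has no value."""
    # comments=True drops anything after a '#'; newlines count as whitespace.
    tokens = shlex.split(conf_text, comments=True)
    for i, token in enumerate(tokens):
        if token == "-save-logfile" and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


if __name__ == "__main__":
    import os

    # Assumption: VDT_LOCATION is set, as on an OSG/VDT install.
    base = os.environ.get("VDT_LOCATION")
    if base:
        conf = os.path.join(base, "globus", "etc", "globus-job-manager.conf")
        with open(conf) as handle:
            print(save_logfile_setting(handle.read()))
```

If this prints on_error, switching the value to always should keep the files for every job, at the cost of the disk usage mentioned above.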

Thanks.

Ian

--
Ian Stokes-Rees, Research Associate
SBGrid, Harvard Medical School
http://sbgrid.org