Re: [HTCondor-users] after upgrading 802, err and Output not captured



On Mar 6, 2014, at 6:02 AM, Sadananda Tripathy <tripathy@xxxxxxxxxxx> wrote:

Hi All,
I need help with the following issue.
I have a small pool with the following setup:
Number of slots: 200
OS of all nodes: RHEL 6
Condor version: 7.6.6
Condor universe:  vanilla
Shared file system (using NFS)
We submit batch jobs through scripts, and the jobs generate output files on an NFS file system (which is mounted at a particular path on every machine).
The error and output dumps are also written to the NFS file system.
 
We recently upgraded our submit machine from Condor 7.6.6 to 8.0.2. Since then, we no longer get any data in the error and output files. Instead, “_condor_stderr” and “_condor_stdout” files are created, and these contain all of the error and output dumps.
 
Can anyone suggest what configuration I need so that the error and output dumps of a running job are captured, while the job runs, in the files specified in the submit description file?
 
The sample submit description file is as below.
 
TOPDIR = /delsoft/condor/test
Executable = $(TOPDIR)/condor_test.csh
universe = vanilla
requirements = OpSys == "LINUX"
rank = Memory >= 64
image_size = 28000
output = result/loop_out.$(Process)
error = result/loop_error.$(Process)
log = result/loop.log
request_memory = 1 GB
initialdir = $(TOPDIR)
queue 3
 
 
Please note that the following variables are set in the Condor config file on all machines.
SOFT_UID_DOMAIN         = TRUE
UID_DOMAIN              = noida.atrenta.com
FILESYSTEM_DOMAIN       = noida.atrenta.com
DEFAULT_DOMAIN_NAME             = noida.atrenta.com
 
Please let me know if you have any solution to the above issue.

Either of these should fix the problem:

* Add ‘should_transfer_files = NO’ to the submit file.

* Upgrade all nodes to HTCondor 8.0.x.
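
For the first option, here is a sketch of the submit file from the question with that one line added. Everything else is copied unchanged from the original; where the new line sits relative to the others is an arbitrary choice, as long as it comes before the queue statement.

TOPDIR = /delsoft/condor/test
Executable = $(TOPDIR)/condor_test.csh
universe = vanilla
# Added line: rely on the shared NFS file system instead of HTCondor file transfer
should_transfer_files = NO
requirements = OpSys == "LINUX"
rank = Memory >= 64
image_size = 28000
output = result/loop_out.$(Process)
error = result/loop_error.$(Process)
log = result/loop.log
request_memory = 1 GB
initialdir = $(TOPDIR)
queue 3

With should_transfer_files = NO, the job reads and writes its files directly on the shared file system, so this assumes the NFS mount is available at the same path on every execute node, as described in the question.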

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project