
[Condor-users] "Globus error 122: could not read the job state file" from condor-g



Hi

I'm trying to submit a job, as part of a DAG, using Condor-G to a gt2 grid resource. My submit file is as follows:

universe = grid
grid_resource = gt2 marlin.phys.uwm.edu/jobmanager-managedfork
executable = /home/ram/opt/glue/bin/LSCdataFind
arguments = --observatory $(macroobservatory) --url-type file --gps-start-time $(macrogpsstarttime) --gps-end-time $(macrogpsendtime) --output $(macrooutput) --lal-cache --type $(macrotype) --match $(macromatch)
getenv = True
transfer_output_files = $(macrooutput)
when_to_transfer_output = ON_EXIT_OR_EVICT
log = /home/ram/grid/ldg_test/logs/tmpczSB_K
error = logs/datafind-$(macroobservatory)-$(macrotype)-$(macrogpsstarttime)-$(macrogpsendtime)-$(cluster)-$(process).err
output = logs/datafind-$(macroobservatory)-$(macrotype)-$(macrogpsstarttime)-$(macrogpsendtime)-$(cluster)-$(process).out
notification = never
queue 1

The job is put on hold with the reason: "Globus error 122: could not read the job state file". I've looked in the GridManagerLog and the other Condor log files but can't find anything that sheds further light on this.

The information I've found states that this error may be caused by $GLOBUS_LOCATION/tmp being full or having overly restrictive permissions. However, there is over 80G free there, and the directory has these permissions:

drwxrwxrwt  2 root root   760 May 30 14:10 .
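
The two conditions mentioned above (free space and permissions) can be checked together with a small script. This is only a minimal sketch, and it makes some assumptions: that GLOBUS_LOCATION is set in the environment, that the script runs on the gatekeeper host as the account the job is mapped to, and that being able to create a file in the directory is a reasonable stand-in for what the jobmanager needs. The function name check_dir is made up for illustration.

```python
import os
import stat
import tempfile


def check_dir(path):
    """Return the mode string, free space in GiB, and writability of a directory."""
    st = os.stat(path)
    vfs = os.statvfs(path)  # Unix-only filesystem statistics
    # Space available to unprivileged users, converted to GiB.
    free_gib = vfs.f_bavail * vfs.f_frsize / 2**30
    try:
        # Try to create (and automatically remove) a scratch file as the current user.
        with tempfile.NamedTemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False
    return {
        "mode": stat.filemode(st.st_mode),
        "free_gib": free_gib,
        "writable": writable,
    }


if __name__ == "__main__":
    # Assumption: GLOBUS_LOCATION points at the Globus installation on this host.
    base = os.environ.get("GLOBUS_LOCATION")
    tmp_dir = os.path.join(base, "tmp") if base else None
    if tmp_dir and os.path.isdir(tmp_dir):
        print(tmp_dir, check_dir(tmp_dir))
    else:
        print("GLOBUS_LOCATION is not set or has no tmp directory")
```

If "writable" comes back False for the mapped account even though the mode string looks permissive, the cause may lie outside the plain mode bits (for example a quota or a read-only mount), which the listing above would not show.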

What else could be causing this error?

Cheers

Adam