
Re: [Condor-users] File stage-in fails in Condor 6.9.1 as a submitter to GT4



On Jan 23, 2007, at 6:18 PM, Gabriel Mateescu wrote:

There is a problem with file stage-in that appears in Condor-G
for Condor version 6.9.1, but did not appear in Condor 6.7.x.

Specifically, the following submit file

$ more  host.condorg
universe  = grid
grid_resource = gt4 FQDN:8443 Fork
Executable = /bin/hostname
InitialDir = /home/gabriel/SubmitCondor
Output =   host.condorg.$(Cluster).out
Error  =   host.condorg.$(Cluster).err
Log    =   host.condorg.$(Cluster).log
log_xml = True
Notification = Never
Transfer_Executable = False
when_to_transfer_output = ON_EXIT_OR_EVICT
queue


works in Condor 6.7.x both when FQDN is the same host as the
central manager (i.e., Condor and Globus GRAM are co-located)
and when FQDN is a remote host (i.e., the job is submitted to a
remote Globus resource).

However, in Condor 6.9.1, it only works when FQDN is
the local host, i.e., Condor submits to the Globus GRAM
on the same machine as the Condor central manager.
If FQDN is a remote machine, the job is put on hold and
the following error occurs:

HoldReason = "Globus error: Staging error for RSL element fileStageIn."


In both cases, in the job class-ad I see

  x509userproxy = "/tmp/x509up_u501"

where

  $ grid-proxy-info  -type -timeleft
    Proxy draft (pre-RFC) compliant impersonation proxy
    571631


I think that before submitting the job, Condor-G delegates
the X509 credential, then inserts the EPR of the delegated
credential resource in the job RSL submitted to Globus.
Has anything in the credential handling changed between
versions 6.7.x and 6.9.1 of Condor?

Using your submit file, I can successfully submit a job to a remote machine using Condor 6.9.1. Nothing has changed in how Condor delegates the job credential between 6.7.x and 6.9.1.

If you set 'GRIDMANAGER_DEBUG = D_FULLDEBUG' in your config file, you should see a large Java exception stack trace in the gridmanager log for the stage-in error. The trace should give more detail about what exactly went wrong.
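
As a minimal sketch of that setting, the configuration fragment might look like the following. Only the GRIDMANAGER_DEBUG line comes from this message; placing it in the local configuration file on the submit machine, and the comment text, are assumptions.

```
# Assumption: this goes in the local Condor configuration file on the submit machine.
# It raises the gridmanager's log verbosity so the stage-in failure is logged in full.
GRIDMANAGER_DEBUG = D_FULLDEBUG
```

The change may not take effect until the configuration is re-read (for example with condor_reconfig) or the gridmanager restarts. Running 'condor_config_val GRIDMANAGER_LOG' should print the path of the gridmanager log on your installation.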

+--------------------------------+-----------------------------------+
|           Jaime Frey           | I used to be a heavy gambler.     |
|       jfrey@xxxxxxxxxxx        | But now I just make mental bets.  |
| http://www.cs.wisc.edu/~jfrey/ | That's how I lost my mind.        |
+--------------------------------+-----------------------------------+