
[Condor-users] condor_g error when globus-job-run works



Hello All,

I'm trying to debug a problem we're having with a submit machine here, but I can't find any help via Google or the list archive, so hopefully someone here can help!

In short, I can run globus jobs using globus directly (via globus-job-run etc.), but jobs submitted through condor quickly go on hold and report the following error message (from the job's log file):

Globus job submission failed!
Reason: 7 authentication failed: GSS Major Status: Authentication Failed GSS Minor Status Error Chain: init.c:499: globus_gss_assist_init_sec_context_async: Error during context initialization init_sec_context

In more detail, I am submitting via condor_g from a Debian sarge installation (2.4.27-2-386 kernel) running Globus Toolkit 4.0.1 and Condor version 6.6.10 to any of several remote resources running various versions of Linux and Globus Toolkit 2.4.3, 3.2.1 and 4.0.1. Running globus-job-run commands directly works fine (against both the fork and pbs jobmanagers), but any job submitted via condor_g fails with the above error message in the local logs.

The logs on the remote machine read as follows:

Notice: 5: Authenticated globus user: /C=UK/O=eScience/OU=Cambridge/L=UCS/CN=richard bruin
Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
Notice: 5: Requested service: jobmanager-fork
Notice: 5: Authorized as local user: rbru03
Notice: 5: Authorized as local uid: 501
Notice: 5:           and local gid: 501
Notice: 0: executing /usr/local/globus/libexec/globus-job-manager
Notice: 0: GRID_SECURITY_CONTEXT_FD=9
Notice: 0: Child 10758 started
Notice: 6: globus-gatekeeper pid=10838 starting at Tue Mar  7 16:25:06 2006

Notice: 6: Got connection 128.232.232.27 at Tue Mar  7 16:25:06 2006

Failed reading length 0
GSS authentication failure
    globus_gss_assist token :3: read failure: Connection closed
Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003

Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003

Notice: 6: globus-gatekeeper pid=10839 starting at Tue Mar  7 16:25:06 2006

Notice: 6: Got connection 128.232.232.27 at Tue Mar  7 16:25:06 2006

Failed reading length 0
GSS authentication failure
    globus_gss_assist token :3: read failure: Connection closed
Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003

Failure: GSS failed Major:01090000 Minor:00000000 Token:00000003
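For anyone who wants to tally attempts in a longer gatekeeper log, here is a minimal Python sketch. The function name and the assumption that every connection attempt begins with a "globus-gatekeeper pid=... starting" line are illustrative only; they are inferred from the excerpt above and are not part of Globus or Condor.

```python
import re

# Assumption: each attempt opens with a "globus-gatekeeper pid=N starting" line,
# as in the excerpt above. These patterns are illustrative, not a Globus log spec.
START = re.compile(r"globus-gatekeeper pid=(\d+) starting")
CONN = re.compile(r"Got connection (\d+\.\d+\.\d+\.\d+)")
FAIL = re.compile(r"Failure: GSS failed Major:([0-9A-Fa-f]+)")


def summarize_gatekeeper_log(text):
    """Return one dict per attempt: pid, client address, GSS major codes seen."""
    attempts = []
    current = None
    for line in text.splitlines():
        match = START.search(line)
        if match:
            # A new gatekeeper process marks the start of a new attempt.
            current = {"pid": int(match.group(1)), "client": None, "gss_failures": []}
            attempts.append(current)
            continue
        if current is None:
            continue  # lines before the first "starting" marker are skipped
        match = CONN.search(line)
        if match:
            current["client"] = match.group(1)
            continue
        match = FAIL.search(line)
        if match:
            current["gss_failures"].append(match.group(1))
    return attempts
```

Run over the excerpt above, this yields two attempts (pids 10838 and 10839), both from 128.232.232.27, each with the major code 01090000 recorded twice. The successful authentication at the top of the excerpt comes before the first "starting" line, so it is skipped.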

Does anyone have any idea what is happening here? I have other machines with near-identical installations and they work fine; it just seems to be this one client machine!

Any help you could provide would be much appreciated, thanks in advance,

Rich

-------------------------------
Richard Bruin
PhD Student
Department of Earth Sciences
University of Cambridge
eMinerals project www.eminerals.org
rbru03@xxxxxxxxxxxxx