
Re: [Condor-users] condor_rm failing for one user because of credential problem



I can't answer your question, but it raises another issue we have.  We
had a student visit us a couple of years ago and one of his jobs is
still in the queue.  We cannot figure out how to get rid of it.  How can
it be done on a Windows pool?

Marshall

Marshall L. Buhl Jr.
NREL/NWTC
Voice: +1 (303) 384-6914
Fax: +1 (303) 384-6901


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of David A. Kotz
Sent: Monday, October 30, 2006 3:07 PM
To: Condor-Users Mail List
Subject: [Condor-users] condor_rm failing for one user because of credential problem

I have one user who can submit and run jobs without any trouble, but who
cannot remove his jobs from the queue.  I can remove them with my queue
superuser account.  My first thought was that the filesystem where it
was attempting to write (for filesystem authentication) was full, but
that was not the case.  Other users can submit and remove jobs.  Any
ideas what might be causing this problem?
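
A quick way to rule out the /tmp side of this (free space, free inodes,
and directory permissions) is a check along the following lines.  This is
only a sketch: it assumes FS authentication is writing under /tmp, as the
/tmp/FS_... path in the SchedLog entry suggests, and the function name
describe_tmp is made up for illustration.  It does not talk to Condor at
all.

```python
import os
import shutil
import stat


def describe_tmp(path="/tmp"):
    """Report free space, free inodes, and permission flags for a directory.

    The default of /tmp is an assumption based on the FS_... path in the log.
    """
    usage = shutil.disk_usage(path)   # total / used / free, in bytes
    vfs = os.statvfs(path)            # filesystem-level counters (Unix only)
    mode = os.stat(path).st_mode
    return {
        "free_bytes": usage.free,
        "free_inodes": vfs.f_favail,  # inodes available to unprivileged users
        "is_dir": stat.S_ISDIR(mode),
        "sticky": bool(mode & stat.S_ISVTX),
        "world_writable": bool(mode & stat.S_IWOTH),
    }


if __name__ == "__main__":
    for key, value in describe_tmp().items():
        print(f"{key}: {value}")
```

Running it once as the affected user and once as a user for whom
condor_rm works, then comparing the output, should show whether the two
accounts see /tmp differently.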

________________________________

User's error message:

AUTHENTICATE:1003:Failed to authenticate with any method
AUTHENTICATE:1004:Failed to authenticate using GSI
GSI:5003:Failed to authenticate.  Globus is reporting error (851968:45).
There is probably a problem with your credentials.  (Did you run 
grid-proxy-init?)
AUTHENTICATE:1004:Failed to authenticate using KERBEROS
AUTHENTICATE:1004:Failed to authenticate using FS
Couldn't find/remove all of user all's job(s).


_________________________________

$CondorVersion: 6.8.2 Oct 12 2006 $
$CondorPlatform: I386-LINUX_RHEL3 $
Linux [hostname] 2.6.17.4 #1 SMP Wed Jul 12 14:41:00 CDT 2006 i686 GNU/Linux
running on Ubuntu Dapper

_____________________________

related SchedLog entry:

10/30 10:18:58 (pid:4817) DaemonCore: Command received via TCP from host 
<128.83.120.62:35604>
10/30 10:18:58 (pid:4817) DaemonCore: received command 478 
(ACT_ON_JOBS), calling handler (actOnJobs)
10/30 10:18:58 (pid:4817) authenticate_self_gss: acquiring self 
credentials failed. Please check your Condor configuration file if this 
is a server process. Or the user environment variable if this is a user 
process.

GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Error with GSI credential
globus_gsi_gssapi: Error with gss credential handle
globus_credential: Valid credentials could not be found in any of the 
possible locations specified by the credential search order.
Valid credentials could not be found in any of the possible locations 
specified by the credential search order.

Attempt 1

globus_credential: Error reading host credential
globus_sysconfig: Could not find a valid certificate file: The host cert 
could not be found in:
1) env. var. X509_USER_CERT
2) /etc/grid-security/hostcert.pem
3) $GLOBUS_LOCATION/etc/hostcert.pem
4) $HOME/.globus/hostcert.pem

The host key could not be found in:
1) env. var. X509_USER_KEY
2) /etc/grid-security/hostkey.pem
3) $GLOBUS_LOCATION/etc/hostkey.pem
4) $HOME/.globus/hostkey.pem



Attempt 2

globus_credential: Error reading proxy credential
globus_sysconfig: Could not find a valid proxy certificate file location
globus_sysconfig: Error with key filename
globus_sysconfig: File does not exist: /tmp/x509up_u0 is not a valid file

Attempt 3

globus_credential: Error reading user credential
globus_sysconfig: Error with certificate filename: The user cert could 
not be found in:
1) env. var. X509_USER_CERT
2) $HOME/.globus/usercert.pem
3) $HOME/.globus/usercred.p12




10/30 10:18:58 (pid:4817) AUTHENTICATE: no available authentication 
methods succeeded, failing!
10/30 10:18:58 (pid:4817) actOnJobs() aborting: SCHEDD:4001:Failed to 
act on jobs - Authentication failed|AUTHENTICATE:1003:Failed to 
authenticate with any method|AUTHENTICATE:1004:Failed to authenticate 
using GSI|GSI:5003:Failed to authenticate.  Globus is reporting error 
(851968:45).  There is probably a problem with your credentials.  (Did 
you run grid-proxy-init?)|AUTHENTICATE:1004:Failed to authenticate using 
KERBEROS|AUTHENTICATE:1004:Failed to authenticate using FS|FS:1005:Bad 
attributes on (/tmp/FS_XXXSAH3Uf)
10/30 10:18:58 (pid:4817) condor_write(): Socket closed when trying to 
write 13 bytes to <[IP]:35604>, fd is 11
10/30 10:18:58 (pid:4817) Buf::write(): condor_write() failed
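
The actOnJobs() line above packs the whole error stack into a single
pipe-separated string, which is hard to read.  A small sketch for
splitting it into one entry per line follows.  The SUBSYSTEM:CODE:text
layout is inferred from this log excerpt only, not from Condor
documentation, and split_error_stack is a hypothetical helper name.

```python
# Error stack copied from the SchedLog entry above, rejoined into one string.
SAMPLE = (
    "SCHEDD:4001:Failed to act on jobs - Authentication failed"
    "|AUTHENTICATE:1003:Failed to authenticate with any method"
    "|AUTHENTICATE:1004:Failed to authenticate using GSI"
    "|GSI:5003:Failed to authenticate.  Globus is reporting error "
    "(851968:45).  There is probably a problem with your credentials.  "
    "(Did you run grid-proxy-init?)"
    "|AUTHENTICATE:1004:Failed to authenticate using KERBEROS"
    "|AUTHENTICATE:1004:Failed to authenticate using FS"
    "|FS:1005:Bad attributes on (/tmp/FS_XXXSAH3Uf)"
)


def split_error_stack(message):
    """Split 'SUBSYS:CODE:text|SUBSYS:CODE:text' into (subsys, code, text) tuples.

    Assumes each entry has at least two colons; any further colons
    (as in '(851968:45)') are kept as part of the text.
    """
    entries = []
    for part in message.split("|"):
        subsys, code, text = part.split(":", 2)
        entries.append((subsys.strip(), int(code), text.strip()))
    return entries


if __name__ == "__main__":
    for subsys, code, text in split_error_stack(SAMPLE):
        print(f"{subsys:<12} {code}  {text}")
```

Read this way, the last entry (FS:1005, bad attributes on the temporary
path under /tmp) is the only part of the stack that does not also appear
in the user's own error message.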