
Re: [Condor-users] MPI : strange globus error, though not using globus



OK, forget all this....

This has nothing to do with MPI.

The condor_master that started was the correct one, but all the other daemons (schedd, startd, etc.) were the "old" ones, because the global condor_config file still had the bad value.
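
For anyone hitting the same thing, here is a rough sketch of how such a mismatch could be spotted. It is not part of Condor: the function names are made up for illustration, it assumes a Linux /proc filesystem, and it assumes that all daemons are supposed to live in the same directory as condor_master (true for a default install layout, but your configuration may differ). Reading /proc/<pid>/exe for processes owned by another user usually requires root.

```python
import os


def running_condor_binaries(proc_root="/proc"):
    """Map pid -> executable path for running processes named condor_*.

    Hypothetical helper: relies on the Linux /proc/<pid>/exe symlink.
    Returns an empty dict when proc_root does not exist.
    """
    found = {}
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            exe = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            # Process exited, or we lack permission to inspect it.
            continue
        if os.path.basename(exe).startswith("condor_"):
            found[int(entry)] = exe
    return found


def daemons_outside_master_dir(binaries):
    """Return the daemons whose binary is not in condor_master's directory.

    Assumption: every daemon should be started from the same directory
    as condor_master. Returns an empty dict if no master is running.
    """
    masters = [exe for exe in binaries.values()
               if os.path.basename(exe) == "condor_master"]
    if not masters:
        return {}
    master_dir = os.path.dirname(masters[0])
    return {pid: exe for pid, exe in binaries.items()
            if os.path.dirname(exe) != master_dir}


if __name__ == "__main__":
    for pid, exe in sorted(daemons_outside_master_dir(running_condor_binaries()).items()):
        print("pid %d runs %s, which is not in the master's directory" % (pid, exe))
```

If this prints anything, the master is probably reading a configuration that still points the other daemons at a different installation.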

It's weekend time now. Bye :)
Nicolas


----------------
On Fri, 2 Feb 2007 11:59:57 +0100
Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:

> Details : I just figured out that this happens for every job I try to submit to my pool, even non-MPI.
> 
> It seems to be an authentication problem, but I don't understand why, since I never had this before. The only recent change on my pool is that I had to replace the hard disk of the central manager and set up the computer again, but I had all the data on backups, and everything should be exactly the same...
> 
> I'm still using the same NIS/NFS installation, and users can still log in to each computer, with their ~HOME/ correctly set up...
> 
> Any idea of what I forgot ?
> 
> Nicolas
> 
> ----------------
> On Thu, 1 Feb 2007 16:02:56 +0100
> Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:
> 
> > Hi
> > 
> > (FYI, I'm setting up the parallel applications, sorry to flood the list today...)
> > 
> > So, I set up a dedicated scheduler and 2 dedicated resources. This is all on a private LAN, and has nothing to do with globus, condor-g or anything else that would link my pool to another one.
> > 
> > And now, when I submit my MPI job, I get the following errors:
> > 
> > $ condor_submit CondorMpiTest.cmd
> > Submitting job(s)
> > ERROR: Failed to connect to local queue manager
> > AUTHENTICATE:1003:Failed to authenticate with any method
> > AUTHENTICATE:1004:Failed to authenticate using GSI
> > GSI:5003:Failed to authenticate.  Globus is reporting error (851968:45).  There is probably a problem with your credentials.  (Did you run grid-proxy-init?)
> > AUTHENTICATE:1004:Failed to authenticate using KERBEROS
> > AUTHENTICATE:1004:Failed to authenticate using FS
> > 
> > $ ps ax|grep cond
> >  7602 ?        Ss     0:02 /nfs/opt/condor_i686/sbin/condor_master
> >  7603 ?        Ss     0:00 condor_schedd -f
> >  8120 pts/0    S+     0:00 tail -f /scratch/condor/log/SchedLog
> >  8129 pts/1    S+     0:00 grep cond
> > 
> > $ condor_q
> > -- Submitter: seurat.lbt.ibpc.fr : <172.27.xx.xx:32795> : seurat.my.domain.fr
> >  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
> > 0 jobs; 0 idle, 0 running, 0 held
> > ##################################
> > 
> > -  And I have in the SchedLog : 
> > 
> > 2/1 15:39:08 (pid:7603) authenticate_self_gss: acquiring self credentials failed. Please check your Condor configuration file if this is a server process. Or the user environment variable if this is a user process.
> > 
> > GSS Major Status: General failure
> > GSS Minor Status Error Chain:
> > globus_gsi_gssapi: Error with GSI credential
> > globus_gsi_gssapi: Error with gss credential handle
> > globus_credential: Valid credentials could not be found in any of the possible locations specified by the credential search order.
> > Valid credentials could not be found in any of the possible locations specified by the credential search order.
> > 
> > Attempt 1
> > 
> > globus_credential: Error reading host credential
> > globus_sysconfig: Could not find a valid certificate file: The host cert could not be found in:
> > 1) env. var. X509_USER_CERT
> > 2) /etc/grid-security/hostcert.pem
> > 3) $GLOBUS_LOCATION/etc/hostcert.pem
> > 4) $HOME/.globus/hostcert.pem
> > 
> > The host key could not be found in:
> > 1) env. var. X509_USER_KEY
> > 2) /etc/grid-security/hostkey.pem
> > 3) $GLOBUS_LOCATION/etc/hostkey.pem
> > 4) $HOME/.globus/hostkey.pem
> > 
> > 
> > 
> > Attempt 2
> > 
> > globus_credential: Error reading proxy credential
> > globus_sysconfig: Could not find a valid proxy certificate file location
> > globus_sysconfig: Error with key filename
> > globus_sysconfig: File does not exist: /tmp/x509up_u0 is not a valid file
> > 
> > Attempt 3
> > 
> > globus_credential: Error reading user credential
> > globus_sysconfig: Error with certificate filename: The user cert could not be found in:
> > 1) env. var. X509_USER_CERT
> > 2) $HOME/.globus/usercert.pem
> > 3) $HOME/.globus/usercred.p12
> > 
> > 
> > 
> > 
> > 2/1 15:39:09 (pid:7603) AUTHENTICATE: no available authentication methods succeeded, failing!
> > 2/1 15:39:09 (pid:7603) SCHEDD: authentication failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using GSI|GSI:5003:Failed to authenticate.  Globus is reporting error (851968:133).  There is probably a problem with your credentials.  (Did you run grid-proxy-init?)|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate using FS
> > 2/1 15:39:09 (pid:7603) IO: Failed to read packet header
> > 2/1 15:39:25 (pid:7603) IO: Failed to read packet header
> > 
> > #####################################
> > 
> > So, why is this globus/grid/proxy error showing up here?
> > 
> > What did I miss ?
> > 
> > Nicolas
> 
> 
> ----------------------------------------------------
> CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
> Institut de Biologie Physico-Chimique
> 13 rue Pierre et Marie Curie
> 75005 PARIS - FRANCE
> 
> Tel : +33 158 41 51 70
> Fax : +33 158 41 50 26
> ----------------------------------------------------
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> 
> The archives can be found at either
> https://lists.cs.wisc.edu/archive/condor-users/
> http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
> 
