Re: [Condor-users] condor_credd issues

I had a lot of problems with the credd (condor_credd) in the past, and UW has made a lot of changes to it over the last year. Your earlier versions do not have all of the updates, but you should be OK with regard to this problem; we are still running 7.6.1.

I have not had your problem (our credentials also change quarterly), but since you only see the central host listed when you run condor_status, my guess is that there is a permissions problem affecting communication between the Condor nodes and the central manager.

Check that the pool password is stored, and first tackle the fact that you can't see all the machines when you run condor_status, because this seems to be the underlying problem. Make sure the global config has the proper security settings and that communication between the machines is allowed.
Look for a credd dump file on the credd server in the Condor log directory.
If you have not done so already, you could restart the condor service on the central manager and make sure the credd service is not crashing. My guess is that the credd is OK, though.
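
For what it's worth, here is a rough sketch of the kind of settings I would look at first. It is only an illustration, not your actual config: the host names and the *.example.org pattern are placeholders I made up, and the right values depend on how your pool is set up.

```
# Hypothetical example values; substitute your own hosts and domain.

# Machines allowed to query the pool and to write to it (advertise
# themselves, submit jobs, store credentials). The error from
# condor_store_cred points at ALLOW_WRITE, so compare this against
# the machine you run the command from.
ALLOW_READ  = *.example.org
ALLOW_WRITE = *.example.org

# Where the credd runs (placeholder host name).
CREDD_HOST = credd.example.org

# Authentication methods a Windows client may use when it talks to
# the credd.
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD

# To see the value a machine is really using, run:
#   condor_config_val ALLOW_WRITE
# To store the pool password on a machine, run:
#   condor_store_cred add -c
# After editing the config, run condor_reconfig or restart the service.
```

If condor_config_val on an execute node shows a different ALLOW_WRITE from the central manager, that mismatch would be the first thing I would chase.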


From: Eric Abel <Eric.Abel@xxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Date: 02/17/2012 12:47 PM
Subject: [Condor-users] condor_credd issues
Sent by: condor-users-bounces@xxxxxxxxxxx

Fellow Condor users,

I have been wrestling with what appears to be a condor_credd problem for about 2 days now.  I have a Windows pool of about 160 CPUs, and it has been working more or less problem-free for about 9 months.  Our IT policy is to change domain passwords every quarter, and in the past I have done this without any trouble.  However, this most recent time, after resetting my password, I could no longer submit jobs (a "no password stored for user" error on the schedd machine).  When I try

condor_store_cred add

I get the error:

Operation failed.  Make sure your ALLOW_WRITE setting includes this host.

This is not a new problem, and I have followed all of the suggestions from previous posts multiple times.  In my case, nothing I do will allow me to set the password, and furthermore, I cannot set it on any machine in the pool, including the central host and the credd host.  This problem is compounded by the fact that when I run condor_status, I only see the central host listed (a different issue, but the fact that both appeared at the same time makes me think they have the same root cause).  Anyway, I have spent a long time combing log files and editing config files only to get the same result over and over again.  I am running Condor 7.6.6 on the central host, credd host, and schedd host, and the execute nodes are a mixture of 7.6.1 through 7.6.6 installations.  Any suggestions would be greatly appreciated.



