Re: [Condor-users] credd issues: heterogenous system MAC-central; WIN-execute + EC2 (win) when this works



I'll try -debug and check CONFIG_LOG.

The docs say the pool password needs to be set on all machines. The main reason I'm setting up CREDD at all is so that I can submit jobs to the Windows box from the Mac side. My understanding is that all computers need to share the pool password so their daemons can communicate. Beyond that, I need the same login/password accounts on both Mac and Windows so that from the Mac I can run_as_owner on the Windows box. Am I understanding that correctly?

regards, jason
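
For reference, here is a minimal sketch of the kind of submit description file this setup is aiming at: submitted from the Mac and targeting the Windows slots. The executable and output file names are hypothetical placeholders. The OpSys/Arch values are the ones condor_status reports for the Windows slots further down in this thread. run_as_owner is the submit command that asks the execute side to run the job under the submitting user's account, which is the case that needs a credd.

```
# Sketch only: file names below are hypothetical placeholders.
universe     = vanilla
executable   = sample-job.bat
arguments    = 60

# Match only the Windows slots (values as shown by condor_status in this thread).
requirements = (OpSys == "WINNT60") && (Arch == "INTEL")

# Ask the execute machine to run the job as the submitting user.
# This is what depends on STARTER_ALLOW_RUNAS_OWNER and a working credd.
run_as_owner = True

should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output = sample-job.out
error  = sample-job.err
log    = sample-job.log
queue
```

Leaving out run_as_owner (or setting it to False) should let the same job be tested without involving the credd at all, which can help separate matchmaking problems from credential problems.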

On Aug 25, 2011, at 9:46 AM, Timothy St. Clair wrote:

Try adding -debug to the command line for the tool, and check the CONFIG_LOG output on Windows.

I'm not certain, but I don't believe setting the pool password from a non-Windows machine is allowed. I will have to run a test to verify.

Cheers,
Tim

On Wed, 2011-08-24 at 23:43 -0400, Jason Herman wrote:
Please, anybody!!!! help! i've been at it for days to little avail. 

Condor is running smoothly on my Mac (central manager, submit, & execute).
Condor is also running on my Windows box (submit & execute).

However, I need to configure CREDD on the Windows box and can't set the pool password from the Mac side.
IS THIS A FEASIBLE CONFIGURATION? Does anyone have it working?

FYI, the Windows machine is running in Parallels on the Mac. IP addresses and hostnames seem to be resolving fine.

I have followed the manuals on installing CREDD and on password authentication, and tried endless configurations.


**************
I can set the pool password from the Windows side:

C:\Users\Administrator>condor_store_cred add -c
Account: condor_pool@JASONHERMANB752

Enter password:

Operation succeeded.

But I can't set the pool password from the MAC side:

jimi:condor root# condor_store_cred add -c
Account: condor_pool@xxxxxxxxxxxxxxxx

Enter password: 

Operation failed.
  Make sure you have CONFIG access to the target Master.

*****************

I am really at my wits' end with the obscurity of this! How could I not have CONFIG access to the target when I included "*" on both ends??!!



**********************
WINDOWS CONFIG:

CREDD_DEBUG = D_ALL

LOCAL_CREDD = windows_hostname
CREDD_HOST = windows_hostname:$(CREDD_PORT)

CREDD_CACHE_LOCALLY = True
#
STARTER_ALLOW_RUNAS_OWNER = True
#
ALLOW_CONFIG = Administrator@*, root@*, windows_IP, mac_IP, *@mymac_hostname, *
#SEC_CLIENT_AUTHENTICATION_METHODS = FS, NTSSPI, PASSWORD, ANONYMOUS
#SEC_CONFIG_NEGOTIATION = REQUIRED
#SEC_CONFIG_AUTHENTICATION = REQUIRED
#SEC_CONFIG_ENCRYPTION = REQUIRED
#SEC_CONFIG_INTEGRITY = REQUIRED
##

***********************
MAC CONFIG:

##
LOCAL_CREDD = windows_hostname
CREDD_HOST = windows_hostname:$(CREDD_PORT)

STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
##


## You'll also need to ensure that clients are configured to use
## PASSWORD authentication on any machine that can run jobs as the
## submitting user. For example,
###### duplicate line with below:
 SEC_CLIENT_AUTHENTICATION_METHODS = FS, NTSSPI, PASSWORD, ANONYMOUS
##
## And finally, you'll need to enable CONFIG-level access for all
## machines in the pool so that the pool password can be stored:
##


##
 ALLOW_CONFIG = mac_ip, windows_ip, Administrator@*, root@*, *
#   SEC_CONFIG_NEGOTIATION = REQUIRED
#   SEC_CONFIG_AUTHENTICATION = REQUIRED
#   SEC_CONFIG_ENCRYPTION = REQUIRED
#   SEC_CONFIG_INTEGRITY = REQUIRED
##
***************************


thanks, J




On Aug 23, 2011, at 9:46 AM, Timothy St. Clair wrote:

If your VM session exists only to run jobs, have you tried setting your START expression to TRUE?

You should not need a credd unless you are running as owner, which is not the default.

Also, your CREDD_HOST *must be* a Windows machine. It may be too early in the a.m., but I can't discern from the logs below whether that is the case.

Cheers,
Tim
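
The suggestions above can be written out as a configuration sketch for the Windows execute node. The hostname is a placeholder, and START = TRUE assumes the VM really does exist only to run jobs. The credd-related settings matter only if jobs must run as the submitting user.

```
# condor_config.local on the Windows execute node (sketch; hostname is a placeholder)

# Always willing to start jobs, on the assumption that this VM exists only to run them.
START = TRUE

# Needed only when jobs are submitted with run_as_owner, which is not the default.
STARTER_ALLOW_RUNAS_OWNER = True

# The credd has to run on a Windows machine, so point at the Windows host.
CREDD_HOST = windows_hostname
CREDD_CACHE_LOCALLY = True
```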

On Thu, 2011-08-18 at 19:19 -0400, Jason Herman wrote:
hi-

Here are the machines I'm setting up:

1) Mac (Intel OS X) - as Condor central manager
2) Parallels VM running Windows within the Mac - as execute machine
3) separate Windows desktop
4) after everything else works: EC2 Windows machines - I suppose running as a cluster that attaches as a flock (perhaps with Cycle Computing)

I have tried (for days):
* playing with various configurations of condor_config & condor_config.local on both machines
* taking down the firewalls on both sides
* reading manuals, googling, etc.
* running condor_store_cred with various settings on both sides

STATUS:
So far I have Condor up and running on the Mac as an execute, submit, and central manager installation, and I successfully ran a test job. The Windows execute node is up, but I can't test it until I get credd security working properly (I think that's the problem). I can see the Windows and Mac slots from both sides (see below).

When I submit a job from the Mac that has Windows requirements, it doesn't run. Presently, condor_q -analyze says "not yet been considered by the matchmaker" and "match but reject the job for unknown reasons." Under a previously attempted configuration it said "reject your job because of their own requirements"; the Windows slot would go to 'Matched', but the job would stay Idle and the logs would suggest a security issue.

I can't even condor_rm the Idle jobs on the Mac side. I'm guessing that their being matched to Windows ceded control of them:
------
jimi:~ root# condor_q


-- Submitter: jimi.westell.com : <169.254.177.117:49371> : jimi.westell.com
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

11.0   Jason           8/17 22:10   0+01:46:05 I  0   0.0 sample-job 60
13.0   Jason           8/18 01:12   0+01:24:43 I  0   0.0 sample-job 60
14.0   Jason           8/18 01:24   0+00:02:49 I  0   0.0 sample-job 60
15.0   Jason           8/18 01:53   0+00:00:00 I  0   0.0 sample-job 60

4 jobs; 4 idle, 0 running, 0 held

jimi:~ root# condor_rm 11.0
AUTHENTICATE:1003:Failed to authenticate with any method
No result found for job 11.0
------


CONFIGURATIONS:


-------- condor_config.local on MAC:
--------
CREDD_HOST = 10.211.55.10
STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
ALLOW_CONFIG = root@$(CONDOR_HOST), *
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED
SEC_PASSWORD_FILE = /usr/local/condor/etc/pool_password

-------- condor_config.local on Windows:
--------
CREDD_HOST = xx.xxx.55.10
STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
ALLOW_CONFIG = *
SEC_CONFIG_NEGOTIATION = REQUIRED
SEC_CONFIG_AUTHENTICATION = REQUIRED
SEC_CONFIG_ENCRYPTION = REQUIRED
SEC_CONFIG_INTEGRITY = REQUIRED

------- condor_config on Windows
------- I made this low security just to try to get it working:
-------
ALLOW_WRITE = *
ALLOW_READ = *
#... not sure what else you need to see


LOG FILES:

--------- CredLog - on windows
--------- this is after turning MAC & WIN firewalls off - not a perm solution, but not working anyway:
---------
08/18/11 14:42:18 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:42:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:47:18 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:47:18 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 SECMAN: required authentication with <xxx.xxx.1.21:9618> failed, so aborting command UPDATE_AD_GENERIC.
08/18/11 14:47:18 ERROR: SECMAN:2004:Failed to create security session to <xxx.xxx.1.21:9618> with TCP.
|AUTHENTICATE:1003:Failed to authenticate with any method
08/18/11 14:47:18 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:47:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:52:39 attempt to connect to <xxx.xxx.1.21:9618> failed: timed out after 20 seconds.
08/18/11 14:52:39 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:52:39 ERROR: SECMAN:2004:Failed to create security session to <xxx.xxx.1.21:9618> with TCP.
|SECMAN:2003:TCP connection to <xxx.xxx.1.21:9618> failed.
08/18/11 14:52:39 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:52:39 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s

--------- MasterLog - on windows
---------
---------
08/18/11 14:51:50 condor_read(): timeout reading 21 bytes from <10.211.55.10:53043>.
08/18/11 14:51:50 IO: Failed to read packet header
08/18/11 14:51:50 store_pool_cred: failed to receive all parameters


COMMAND LINE OUTPUT:

---------- condor_status - on windows
---------- Manual says to run this when you are done, doesn't mention the command only works on the windows side:
C:\Users\Administrator>condor_status -f "%s\t" Name -f "%s\n" ifThenElse(isUndefined(LocalCredd),\"UNDEF"\",LocalCredd)
slot1@JASONHERMANB752   UNDEF
slot1@xxxxxxxxxxxxxxxx  UNDEF
slot2@JASONHERMANB752   UNDEF
slot2@xxxxxxxxxxxxxxxx  UNDEF
slot3@xxxxxxxxxxxxxxxx  UNDEF
slot4@xxxxxxxxxxxxxxxx  UNDEF
slot5@xxxxxxxxxxxxxxxx  UNDEF
slot6@xxxxxxxxxxxxxxxx  UNDEF
slot7@xxxxxxxxxxxxxxxx  UNDEF
slot8@xxxxxxxxxxxxxxxx  UNDEF


------- condor_status - MAC (identical on windows)
-------
-------
jimi:log root# condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.210  1024  0+19:09:01
slot2@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  1+11:24:12
slot3@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  1+03:18:37
slot4@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  0+23:14:03
slot5@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  0+15:05:52
slot6@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  0+11:04:54
slot7@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  0+06:59:54
slot8@xxxxxxxxxxxx OSX        X86_64 Unclaimed Idle     0.000  1024  1+15:27:42
slot1@JASONHERMANB WINNT60    INTEL  Unclaimed Idle     0.120  1023  0+00:00:04
slot2@JASONHERMANB WINNT60    INTEL  Unclaimed Idle     0.100  1023  0+00:00:02
                 Total Owner Claimed Unclaimed Matched Preempting Backfill

   INTEL/WINNT60     2     0       0         2       0          0        0
      X86_64/OSX     8     0       0         8       0          0        0

           Total    10     0       0        10       0          0        0


-------- condor_store_cred on Windows:
--------
--------
C:\Users\Administrator>condor_store_cred -c add
Account: condor_pool@JASONHERMANB752

Enter password:

Operation failed.
Make sure you have CONFIG access to the target Master.


thanks kindly for any assistance, jason



_______________________________________________
Condor-users mailing list
To unsubscribe, send a message
to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
