Re: [Condor-users] condor-reuse-vm2 Job Owner in Windows



Ian, Tammy,

I added the condor_pool password, but still no luck running.
In fact the CredLog itself hasn't been altered for weeks, so
I'm not sure if that's the file I should be looking at or if
credentials are even being checked at all.

Before pursuing this further, can I explain the setup I'm trying to use?
I think what I'm trying to do should be really simple for Condor,
and maybe it's just getting more complicated than it needs to be.

I'm a software developer working on a prototype web services application
that requires a job scheduler to manage a set of compute-intensive tasks.
The tasks are farmed out to a pool of machines so they can run in parallel
and return their results to the application, all in a single work session.
The application uses Condor as that job scheduler and generates a set
of condor_submit commands.  Eventually the application will
be packaged up and installed elsewhere.  In its final installation
it will be used in a closed, classified environment, where Condor will be
used only for this application and the machines on which it is
installed will be dedicated to this purpose.

The system was developed and runs successfully under Linux,
with a front-end 'condor master' and a Linux cluster of 'condor slaves'.
The master and each node in the cluster have Condor installed.
Only the nodes (each with four virtual machines) are available to start jobs.
(Note that during prototyping, the front-end machine, a dual-processor
Linux box, was all that was available, so it was set up to be allowed
to start jobs.)

However, we had to port this whole operation to Windows XP.
And I'm fairly new to Windows.

My current development configuration consists of
a single dual-processor Windows XP machine (userdomain: winxp-dev-01),
on which I am currently the only user ('diane'), and on which my web
application and Condor are installed.  In the final installation, the
intention is to allow other users access to that machine
(the condor_master), with Condor installed on a cluster of
Windows machines (the condor_slaves).

For this new prototype running under Windows, I installed Condor on
my single machine (I downloaded the msi file and
just ran the Condor Windows installer)
and set it up to start automatically at boot.  So
my single machine should be the only machine in the pool.

Condor does work when installed this way.  The jobs show
up as owner 'diane' in the Condor queue, and the actual
jobs (subprocesses) started by the 'executable' show up as running as
'condor-reuse-vmX' when viewed in the Task Manager.

*******
HOWEVER, my particular web application requires that one
of the sub processes started by the condor job that it spawns 
be run as a specific user (namely 'diane'), 
and not user 'condor-reuse-vmX' or SYSTEM.
*******

Therefore, after installation, I altered the condor_config file to
include the following lines (exactly as shown):

CREDD_HOST = $(CONDOR_HOST):$(CREDD_PORT)
CREDD_CACHE_LOCALLY = True
QUEUE_ALL_USERS_TRUSTED = True
STARTER_ALLOW_RUNAS_OWNER = True
HOSTALLOW_CONFIG = winxp-dev-01
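
A quick way to confirm that the edited condor_config really contains all five
lines is a small script like the one below. This is only a sketch: the parser
assumes plain "NAME = value" lines with "#" comments, treats names as
case-insensitive, and leaves macros such as $(CONDOR_HOST) unexpanded. The
function names and the SAMPLE text are hypothetical, not part of Condor.

```python
# Sketch: check that a condor_config text defines the five settings above.
# Assumption: simple "NAME = value" lines; macros are NOT expanded.

REQUIRED_SETTINGS = (
    "CREDD_HOST",
    "CREDD_CACHE_LOCALLY",
    "QUEUE_ALL_USERS_TRUSTED",
    "STARTER_ALLOW_RUNAS_OWNER",
    "HOSTALLOW_CONFIG",
)

# The lines quoted in the message, used here as sample input.
SAMPLE = """\
CREDD_HOST = $(CONDOR_HOST):$(CREDD_PORT)
CREDD_CACHE_LOCALLY = True
QUEUE_ALL_USERS_TRUSTED = True
STARTER_ALLOW_RUNAS_OWNER = True
HOSTALLOW_CONFIG = winxp-dev-01
"""


def parse_config(text):
    """Return {NAME: value} for each 'NAME = value' line; later lines win."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip().upper()] = value.strip()
    return settings


def missing_settings(text, required=REQUIRED_SETTINGS):
    """List the required names that the config text does not define."""
    defined = parse_config(text)
    return [name for name in required if name not in defined]


if __name__ == "__main__":
    print(missing_settings(SAMPLE))
```

This only inspects the text of one file. To see the value the daemons
actually use, condor_config_val followed by the setting name (for example
STARTER_ALLOW_RUNAS_OWNER) reports it from the live configuration.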

I also ran:
	 condor_store_cred add -c -p condor_pool
which seemed to work (told me it was successful)
and
	condor_store_cred add 
to add credentials for 'diane@winxp-dev-01'
which also worked.

I then rebooted to restart condor.

I then altered my  condor_submit command to include
	-name winxp-dev-01
And altered my condor.submit file to include:
	+Owner = "diane"
	run_as_owner = True
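
Since the application generates its condor_submit commands itself, the two
changes above could be produced along these lines. This is a minimal sketch,
not the application's real code: the function names, the executable name
"task.exe", and the "universe = vanilla" line are assumptions added for
illustration. Only the +Owner and run_as_owner lines and the -name option
come from the message.

```python
# Sketch: build the submit description text and the condor_submit command.
# Assumptions: vanilla universe, a single executable, one queued job.


def build_submit_description(executable, owner, run_as_owner=True):
    """Return the text of a submit description file for one job."""
    lines = [
        "universe = vanilla",  # assumed; not stated in the message
        f"executable = {executable}",
    ]
    if run_as_owner:
        # The two lines added to condor.submit in the message.
        lines.append(f'+Owner = "{owner}"')
        lines.append("run_as_owner = True")
    lines.append("queue")
    return "\n".join(lines) + "\n"


def submit_command(submit_file, schedd_name):
    """Return the argument list for condor_submit aimed at a named schedd."""
    return ["condor_submit", "-name", schedd_name, submit_file]


if __name__ == "__main__":
    print(build_submit_description("task.exe", "diane"), end="")
    print(" ".join(submit_command("condor.submit", "winxp-dev-01")))
```

Returning the command as a list keeps it ready for subprocess-style
execution without any shell quoting on Windows.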

Now, as I said, the job shows up in the Condor queue (as owner diane)
but just sits there Idle.  Note that when I remove the
      +Owner and run_as_owner lines,
the job starts, but eventually fails in one of its subprocesses
(because it is being run as 'condor-reuse-vmX' (or 'condor-reuse-slot1'
for ver 6.9) and not 'diane').

You mention checking the StartdLog. Where is that?  I have a StartLog,
but that and the ShadowLog have no new entries associated with the job
submission.
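
One way to see which logs mention a job at all is to scan the whole log
directory for its id, as sketched below. The directory can be found with
condor_config_val LOG. The list of file names is an assumption based on
Condor's usual defaults (the startd's log is normally called StartLog, which
is presumably the file meant by "StartdLog"), and the helper name is
hypothetical. A job that stays Idle has probably not been matched yet, so
SchedLog and NegotiatorLog are likelier places to find it than ShadowLog or
StartLog.

```python
# Sketch: find lines mentioning a job id in Condor's daemon logs.
# Assumption: default log file names; adjust LOG_NAMES for your install.
import os
import tempfile

LOG_NAMES = ("SchedLog", "NegotiatorLog", "MatchLog", "ShadowLog", "StartLog")


def find_log_lines(log_dir, needle, log_names=LOG_NAMES):
    """Return {log name: [matching lines]} for logs that contain needle."""
    hits = {}
    for name in log_names:
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # this daemon may not run on this machine
        with open(path, "r", errors="replace") as handle:
            matching = [line.rstrip("\n") for line in handle if needle in line]
        if matching:
            hits[name] = matching
    return hits


if __name__ == "__main__":
    # Demo on a throwaway directory with made-up log content.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "SchedLog"), "w") as handle:
            handle.write("made-up line about job 42.0\nunrelated line\n")
        print(find_log_lines(demo_dir, "42.0"))
```

If the job id appears nowhere, condor_q -analyze with the job id is the usual
next step for asking why an idle job is not being matched.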

Anyway, I hope the above explanation makes it clearer what I am
trying to do.

Any help would be greatly appreciated.

Thanks,
Diane



-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
Sent: Wednesday, October 24, 2007 6:06 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor-reuse-vm2 Job Owner in Windows

> Thanks for the info sites.  They were very helpful.

I have to confess that we don't use Condor's credential daemon. We run
all our jobs as fixed domain accounts on our Windows boxes. It was
easier. :)
 
> When I make changes to the condor configuration,
> as described in the documentation (I think), restart condor, 
> and then submit the job, the job now hangs 
> in the queue (and doesn't even get started).

Tammy already suggested restarting everything (even the condor_master
processes) on all your machines. I'd start with that. And also make sure
you've stored your password with the credd daemon using
condor_store_cred.

It looks like the match is being rejected. Did the job even try to run?
Check the ShadowLog on the machine running condor_schedd to see if the
job perhaps tried to execute on the machine but couldn't run. Also check
the StartdLog on the machine where you're trying to run the job. To make
debugging this easier I'd target your job to one specific machine and
maybe set that machine to only run your jobs (if the queue is not
exclusively yours). Let us know if that helps at all.

- Ian





