
Re: [Condor-users] condor jobs



Thanks Garrett, 

Examining the StarterLog for an executing slot showed that there was a
mismatch in the UID domain of the submitting machine. Adding
TRUST_UID_DOMAIN=TRUE to the local configs (both submit & execute) and
restarting condor solved it.
It now works for me from a directory that's 755, even without
run_as_owner = True in the submit file.
Thanks for the tip.

Steve

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Koller, Garrett
Sent: 11 August 2011 15:34
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor jobs

Sunil,

That is odd.  Perhaps it is a bug in your specific version of Condor,
but I kinda doubt it.  What output do you get when you run 'condor_version'
(version number and platform)?  I have not upgraded to 7.6.2 yet (I'm
running 7.6.1) so I don't know if I would have the same problem as Steve
and possibly you (if it's a bug in 7.6.2 and you're using that version).
Assuming it's not a bug, try looking at the Starter log of the slot
where the job is executing, presumably on 'slot1@xxxxxxxxxxx' (the log
would be called "StarterLog.slot1" in the Log folder of 'cdstdu.edu').
My guess is that the starter is running into an error where you're
submitting your job as a username (on 20.1.1.1) that doesn't exist on
'cdstdu.edu'.  If so, the StarterLog will indicate that it is exiting
for that reason, although I'm not sure if that would stop it from
writing to the standard out...

Best Regards,
 - Garrett

________________________________________
From: condor-users-bounces@xxxxxxxxxxx
[condor-users-bounces@xxxxxxxxxxx] on behalf of Steven Platt
[Steven.Platt@xxxxxxxxxx]
Sent: Thursday, August 11, 2011 6:58 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor jobs

Following this with interest as I'm getting exactly the same error when
testing an upgrade to 7.6.2.
STARTER_ALLOW_RUNAS_OWNER = TRUE in condor_config.local on both
submitting & execute machines (confirmed by condor_config_val
starter_allow_runas_owner -verbose), and run_as_owner = True in submit
file. Also tried with these settings and submitting from a directory
with 777 permissions, but no joy.
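
The settings check described above can be scripted so that the submit and execute machines are compared side by side. What follows is only a minimal sketch: it assumes `condor_config_val` is on the PATH and that a non-zero exit status means the variable is not defined. `read_knobs` and `differing_knobs` are hypothetical helper names, not part of Condor.

```python
import subprocess
from typing import Callable, Dict, List, Optional

# Configuration variables mentioned in this thread.
KNOBS = ("UID_DOMAIN", "TRUST_UID_DOMAIN", "STARTER_ALLOW_RUNAS_OWNER")


def _condor_config_val(name: str) -> Optional[str]:
    """Return what `condor_config_val NAME` prints, or None if the call fails."""
    try:
        result = subprocess.run(
            ["condor_config_val", name],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # condor_config_val is not installed or not on the PATH.
        return None
    if result.returncode != 0:
        # Assumption: a non-zero exit status means "not defined".
        return None
    return result.stdout.strip()


def read_knobs(
    lookup: Callable[[str], Optional[str]] = _condor_config_val,
) -> Dict[str, Optional[str]]:
    """Collect the current value of each variable in KNOBS on this machine."""
    return {name: lookup(name) for name in KNOBS}


def differing_knobs(
    submit: Dict[str, Optional[str]], execute: Dict[str, Optional[str]]
) -> List[str]:
    """Names whose values differ between two machines, ignoring case."""
    def norm(value: Optional[str]) -> str:
        return (value or "").strip().lower()

    return [n for n in KNOBS if norm(submit.get(n)) != norm(execute.get(n))]
```

Run `read_knobs()` on each machine and pass the two dictionaries to `differing_knobs`; a differing UID_DOMAIN is the kind of mismatch the StarterLog reported.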

This wasn't a problem in our old 7.0.5 installation.

Cheers
Steve
________________________________________
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Sunil M. Dogra
Sent: 11 August 2011 11:27
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor jobs

Thank you Garrett,
I set STARTER_ALLOW_RUNAS_OWNER = TRUE and also changed the output path
to /tmp/test.out, but I still see this in test.log:

000 (009.000.000) 08/11 15:53:55 Job submitted from host: <20.1.1.1:49856>
...
007 (009.000.000) 08/11 15:53:56 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (009.000.000) 08/11 15:53:56 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13




Also, I changed the /tmp permissions to 777.

With Best Regards
sunil
On Wed, Aug 10, 2011 at 6:16 PM, Koller, Garrett
<kollerg14@xxxxxxxxxxxx> wrote:
Sunil,

I think I've come across this before.  The problem is that the job is
being run as the low-permissions user "nobody".  To fix this, set
"run_as_owner = True" in your submit file (STARTER_ALLOW_RUNAS_OWNER
must also be set to TRUE in the Condor configuration file).
Alternatively, if that isn't a good option, you can place the output
files in a folder that is world-readable and world-writable, so that
even the user "nobody", which has no special permissions, can write
files to it.  You probably DON'T want to do this with your home
directory, though.  If you're just testing, try saving the output as
"/tmp/test.out", since that folder is normally writable by anyone, but
keep in mind that "/tmp/" is typically cleared at shutdown or reboot.
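
Whether a folder is open enough for "nobody" to create an output file in it can be checked from the folder's mode bits. This is only a minimal sketch: it assumes "nobody" is neither the owner nor in the group of the folder, so that only the "other" permission bits apply. It ignores ACLs, the permissions of parent folders, and the permissions of an output file that already exists. The function names are hypothetical.

```python
import os
import stat


def other_can_create_files(mode: int) -> bool:
    """True if 'other' has both write and search (execute) permission.

    Creating a file in a directory needs both bits on that directory.
    """
    needed = stat.S_IWOTH | stat.S_IXOTH
    return (mode & needed) == needed


def directory_open_to_nobody(path: str) -> bool:
    """Apply the check above to the mode of an existing directory."""
    return other_can_create_files(os.stat(path).st_mode)
```

A 755 folder fails this check and a 777 folder passes it, which is why a job running as "nobody" cannot write its standard output into a typical home directory.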

Best Regards,
 ~ Garrett

On Aug 10, 2011, at 6:25 AM, Sunil M. Dogra wrote:

Hi,
I submitted one test job and got the following error:

Thank you
With Best Regards
sunil

000 (005.000.000) 06/10 12:17:49 Job submitted from host: <20.1.1.1:33324>
...
007 (005.000.000) 06/10 12:37:52 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (005.000.000) 06/10 12:37:52 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
...
009 (005.000.000) 08/01 12:33:56 Job was aborted by the user.
        via condor_rm (by user condor)
...
000 (006.000.000) 08/10 15:51:37 Job submitted from host: <20.1.1.1:38120>
...
007 (006.000.000) 08/10 15:51:49 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (006.000.000) 08/10 15:51:49 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
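
A job log like the one above can be scanned for held jobs and their hold reasons with a short script. This is only a minimal sketch: it assumes the layout seen in this thread (a header line with a three-digit event code, a job id in parentheses and a timestamp; indented detail lines; events separated by "..."), and that each event line is intact rather than wrapped by a mail client. The function names are hypothetical, and the sample text is taken from the log in this thread.

```python
import re
from typing import Dict, List

# Header line: "012 (006.000.000) 08/10 15:51:49 Job was held."
EVENT_HEADER = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)

SAMPLE_LOG = """\
000 (006.000.000) 08/10 15:51:37 Job submitted from host: <20.1.1.1:38120>
...
007 (006.000.000) 08/10 15:51:49 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (006.000.000) 08/10 15:51:49 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
"""


def parse_events(text: str) -> List[Dict[str, object]]:
    """Split log text into events with code, job id, time, summary, details."""
    events: List[Dict[str, object]] = []
    current = None
    for line in text.splitlines():
        if line.strip() == "...":
            current = None  # "..." closes the current event
            continue
        match = EVENT_HEADER.match(line)
        if match:
            code, cluster, proc, _subproc, stamp, summary = match.groups()
            current = {
                "code": code,
                "job": f"{int(cluster)}.{int(proc)}",
                "time": stamp,
                "summary": summary,
                "details": [],
            }
            events.append(current)
        elif current is not None and line.strip():
            current["details"].append(line.strip())
    return events


def held_jobs(text: str) -> Dict[str, List[str]]:
    """Map job id to the detail lines of its most recent 012 (held) event."""
    return {e["job"]: e["details"] for e in parse_events(text) if e["code"] == "012"}
```

Calling `held_jobs(SAMPLE_LOG)` gives the error line and the "Code 7 Subcode 13" line for job 6.0; the same function can be pointed at the contents of test.log.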
_______________________________________________

-----------------------------------------
**************************************************************************
The information contained in the EMail and any attachments is
confidential and intended solely and for the attention and use of
the named addressee(s). It may not be disclosed to any other person
without the express authority of the HPA, or the intended
recipient, or both. If you are not the intended recipient, you must
not disclose, copy, distribute or retain this message or any part
of it. This footnote also confirms that this EMail has been swept
for computer viruses, but please re-sweep any attachments before
opening or saving. HTTP://www.HPA.org.uk
**************************************************************************
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/