
Re: [Condor-users] condor jobs



Sunil,

That is odd.  Perhaps it is a bug in your specific version of Condor, but I doubt it.  What output do you get when you run 'condor_version' (it prints the version number and platform)?  I have not upgraded to 7.6.2 yet (I'm running 7.6.1), so I don't know whether I would have the same problem as Steve and possibly you (if it's a bug in 7.6.2 and you're using that version).
Assuming it's not a bug, try looking at the Starter log of the slot where the job is executing, presumably 'slot1@xxxxxxxxxxx' (the log would be called "StarterLog.slot1" in the Log folder of 'cdstdu.edu').  My guess is that the starter is running into an error because you're submitting your job as a user (on 20.1.1.1) that doesn't exist on 'cdstdu.edu'.  If so, the StarterLog should indicate that it is exiting for that reason, although I'm not sure whether that would stop it from writing to standard output...

Best Regards,
 - Garrett

________________________________________
From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] on behalf of Steven Platt [Steven.Platt@xxxxxxxxxx]
Sent: Thursday, August 11, 2011 6:58 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor jobs

Following this with interest, as I'm getting exactly the same error when testing an upgrade to 7.6.2.
STARTER_ALLOW_RUNAS_OWNER = TRUE in condor_config.local on both submitting & execute machines (confirmed by condor_config_val starter_allow_runas_owner -verbose), and run_as_owner = True in submit file. Also tried with these settings and submitting from a directory with 777 permissions, but no joy.

This wasn't a problem in our old 7.0.5 installation.

Cheers
Steve
________________________________________
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Sunil M. Dogra
Sent: 11 August 2011 11:27
To: Condor-Users Mail List
Subject: Re: [Condor-users] condor jobs

Thank you Garrett,
I set STARTER_ALLOW_RUNAS_OWNER = TRUE and also changed the output path to /tmp/test.out,
but I still see the following in test.log:

000 (009.000.000) 08/11 15:53:55 Job submitted from host: <20.1.1.1:49856>
...
007 (009.000.000) 08/11 15:53:56 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (009.000.000) 08/11 15:53:56 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
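
For anyone who wants to pull the hold reasons out of a job log like the one above, here is a minimal sketch in Python. It assumes only the layout visible in this excerpt: a header line starting with a three-digit event code and a (cluster.proc.subproc) ID, indented detail lines, and a line containing only "..." between events. The function names and the SAMPLE_LOG string are made up for illustration and are not part of Condor.

```python
import re

# Header layout as seen in the excerpt: "012 (009.000.000) 08/11 15:53:56 Job was held."
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\S+ \S+) (.*)$")

# Copied from the log excerpt in this thread (illustrative input only).
SAMPLE_LOG = """000 (009.000.000) 08/11 15:53:55 Job submitted from host: <20.1.1.1:49856>
...
007 (009.000.000) 08/11 15:53:56 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (009.000.000) 08/11 15:53:56 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/tmp/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
"""


def parse_events(text):
    """Split job-log text into events, one dict per header line."""
    events = []
    current = None
    for line in text.splitlines():
        if line.strip() == "...":
            # A line of three dots closes the current event.
            current = None
            continue
        match = EVENT_HEADER.match(line)
        if match:
            current = {
                "code": match.group(1),
                "cluster": int(match.group(2)),
                "proc": int(match.group(3)),
                "timestamp": match.group(5),
                "summary": match.group(6),
                "details": [],
            }
            events.append(current)
        elif current is not None and line.strip():
            # Indented lines belong to the most recent header.
            current["details"].append(line.strip())
    return events


def hold_reasons(text):
    """Return (cluster, proc, first detail line) for each 012 'Job was held' event."""
    return [
        (e["cluster"], e["proc"], e["details"][0] if e["details"] else "")
        for e in parse_events(text)
        if e["code"] == "012"
    ]


if __name__ == "__main__":
    for cluster, proc, reason in hold_reasons(SAMPLE_LOG):
        print(f"job {cluster}.{proc} held: {reason}")
```

To run it against a real log, read the file into a string and pass that to hold_reasons() in place of SAMPLE_LOG.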




Also, I changed the /tmp permissions to 777.

With Best Regards
sunil
On Wed, Aug 10, 2011 at 6:16 PM, Koller, Garrett <kollerg14@xxxxxxxxxxxx> wrote:
Sunil,

I think I've come across this before.  The problem is that the job is being run as the low-privilege user "nobody".  To fix this, set "run_as_owner = True" in your submit file (STARTER_ALLOW_RUNAS_OWNER must also be set to TRUE in the Condor configuration file).  Alternatively, if that isn't a good option, you can place the output files in a folder that is world-readable and world-writable, so that even "nobody", which has almost no permissions, can write files to it.  You probably DON'T want to do this with your home directory, though.  If you're just testing, try saving the output as "/tmp/test.out", since /tmp is normally writable by anyone, but keep in mind that files in /tmp are typically deleted at reboot.
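
To check whether a given folder is open enough for "nobody" to write into, something like the following rough Python sketch could be run on the execute machine. It assumes a Unix-like system and that "nobody" is covered by the "other" permission bits (it neither owns the folder nor belongs to its group). The function name is hypothetical, and the check looks only at the folder itself, not at its parent directories.

```python
import os
import stat


def others_can_create_files(directory):
    """Return True if the 'other' class has both write and search (execute)
    permission on the directory, which is what creating a file there requires."""
    mode = os.stat(directory).st_mode
    needed = stat.S_IWOTH | stat.S_IXOTH
    return (mode & needed) == needed


if __name__ == "__main__":
    path = "/tmp"
    if os.path.isdir(path):
        mode = os.stat(path).st_mode
        # stat.filemode renders the mode bits in ls -l style, e.g. "drwxr-xr-x".
        print(path, stat.filemode(mode), others_can_create_files(path))
```

If this reports False for the folder that holds your output file, a job running as "nobody" would be expected to fail with the same "Permission denied" error.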

Best Regards,
 ~ Garrett

On Aug 10, 2011, at 6:25 AM, Sunil M. Dogra wrote:

Hi,
I submitted one test job and got the following error:

Thank you
With Best Regards
sunil

000 (005.000.000) 06/10 12:17:49 Job submitted from host: <20.1.1.1:33324>
...
007 (005.000.000) 06/10 12:37:52 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (005.000.000) 06/10 12:37:52 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
...
009 (005.000.000) 08/01 12:33:56 Job was aborted by the user.
        via condor_rm (by user condor)
...
000 (006.000.000) 08/10 15:51:37 Job submitted from host: <20.1.1.1:38120>
...
007 (006.000.000) 08/10 15:51:49 Shadow exception!
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
...
012 (006.000.000) 08/10 15:51:49 Job was held.
        Error from slot1@xxxxxxxxxxx: Failed to open '/home/sunil/test.out' as standard output: Permission denied (errno 13)
        Code 7 Subcode 13
_______________________________________________

Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/