
Re: [Condor-users] Directory creation problem with 7.4.1 on Fedora 12?



On Mon, Aug 2, 2010 at 6:10 PM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
> When you submit, what happens when you set run_as_owner=false?
>
> There is a possibility that there could be a credentials issue.
>
> Cheers,
> Tim
>

Tim,

The problem was actually that some of the dependencies of the
particular binary used for these jobs were not met on the execute node
(i386 versions of libraries).  The error message was hence rather
misleading.
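
For anyone who lands here with the same hold reason: one plausible
explanation (not confirmed in this thread) is that the kernel returns
ENOENT from exec when the dynamic loader named inside the binary is
absent, even though the binary itself exists.  Below is a minimal Python
sketch, under that assumption, that reads the loader path out of an ELF
file and reports whether it is present.  The function names
elf_interpreter and check_executable are hypothetical helpers written for
this illustration; they are not part of Condor.

```python
import os
import struct

PT_INTERP = 3  # program header type that holds the dynamic loader path


def elf_interpreter(data: bytes):
    """Return (bits, loader_path) for an ELF image.

    loader_path is None when there is no PT_INTERP entry
    (for example, a statically linked binary).
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]    # EI_CLASS
    end = {1: "<", 2: ">"}[data[5]]   # EI_DATA: little or big endian
    if bits == 32:
        phoff, = struct.unpack_from(end + "I", data, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 42)
        word, off_pos, size_pos = "I", 4, 16
    else:
        phoff, = struct.unpack_from(end + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 54)
        word, off_pos, size_pos = "Q", 8, 32
    # Walk the program headers looking for the interpreter entry.
    for i in range(phnum):
        base = phoff + i * phentsize
        p_type, = struct.unpack_from(end + "I", data, base)
        if p_type == PT_INTERP:
            p_offset, = struct.unpack_from(end + word, data, base + off_pos)
            p_filesz, = struct.unpack_from(end + word, data, base + size_pos)
            raw = data[p_offset:p_offset + p_filesz]
            return bits, raw.rstrip(b"\0").decode()
    return bits, None


def check_executable(path):
    """Describe the ELF class of `path` and whether its loader exists."""
    with open(path, "rb") as f:
        bits, loader = elf_interpreter(f.read())
    if loader is None:
        return f"{path}: {bits}-bit, no dynamic loader requested"
    status = "present" if os.path.exists(loader) else "MISSING"
    return f"{path}: {bits}-bit, loader {loader} is {status}"
```

Running check_executable on the job's binary on the execute node (the
path is whatever your submit file points at) should show whether a 32-bit
loader is being requested and whether it is missing.  The standard `file`
and `ldd` commands give similar information; this sketch only checks the
loader, not every shared library.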

Thanks for looking.

Cheers,
Adam

> On Mon, 2010-08-02 at 16:27 +0100, Adam Huffman wrote:
>> On Fri, Mar 26, 2010 at 3:25 PM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
>> > I do not see anything out of the ordinary in your condor_config file,
>> > other than that you should no longer need the HOSTALLOW params in your
>> > config file; they have been replaced by ALLOW.
>> >
>> > Have you made any changes to your condor_config.local file?  If so, that
>> > would also be useful; if not, then you may want to change ALL_DEBUG =
>> > D_FULLDEBUG to get more verbose logging, which may help to correctly
>> > identify the issue.
>> >
>>
>> Finally had time to get back to this.  I'm seeing exactly the same
>> behaviour on a brand new Fedora 13 box, with Condor 7.4.2 installed
>> from the Fedora repository.  The only changes I made to
>> condor_config.local were to remove the unnecessary daemons (it's just
>> an execute node).
>>
>> I have logs with that debug setting, so let me know which you'd like to see.
>>
>> Adam
>>
>> > Cheers,
>> > Tim
>> >
>> >
>> > On Wed, 2010-03-24 at 17:02 +0000, Adam Huffman wrote:
>> >> On Wed, Mar 24, 2010 at 3:47 PM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
>> >> > Couple of questions:
>> >> >
>> >> >        1.) How was condor started? (as root or as a user?)
>> >>
>> >> It's set to start automatically and when testing I run
>> >>
>> >> service condor restart
>> >>
>> >> etc.  It appears to run condor_master as the 'condor' user, which has
>> >> its home directory as /var/lib/condor
>> >>
>> >> >        2.) Could you post your condor_config?
>> >> >
>> >>
>> >> Attached.  I've removed a couple of the local host ranges from the
>> >> ALLOW directives.  I added the old-style HOSTALLOW in case that made a
>> >> difference, which it didn't.
>> >>
>> >> Cheers,
>> >> Adam
>> >>
>> >> > Cheers,
>> >> > Tim
>> >> >
>> >> >
>> >> > On Wed, 2010-03-24 at 11:11 +0000, Adam Huffman wrote:
>> >> >> I've installed 7.4.1 from the Fedora repositories on a brand new
>> >> >> Fedora 12 machine.
>> >> >>
>> >> >> Whenever I submit jobs to that machine, the jobs move to the Hold
>> >> >> state.  Here's an example error message on the central manager:
>> >> >>
>> >> >> 112.016:  Request is held.
>> >> >>
>> >> >> Hold reason: Error from starter on slot21@...man.ac.uk: Failed to
>> >> >> execute '/var/lib/condor/execute/dir_2772/condor_exec.exe': No such
>> >> >> file or directory
>> >> >>
>> >> >> and on the execute host itself:
>> >> >>
>> >> >>
>> >> >> 03/24 10:58:39 Using config source: /etc/condor/condor_config
>> >> >> 03/24 10:58:39 Using local config sources:
>> >> >> 03/24 10:58:39    /var/lib/condor/condor_config.local
>> >> >> 03/24 10:58:39 DaemonCore: Command Socket at <...:44951>
>> >> >> 03/24 10:58:39 Done setting resource limits
>> >> >> 03/24 10:58:39 Communicating with shadow <...:36474>
>> >> >> 03/24 10:58:39 Submitting machine is "...man.ac.uk"
>> >> >> 03/24 10:58:39 setting the orig job name in starter
>> >> >> 03/24 10:58:39 setting the orig job iwd in starter
>> >> >> 03/24 10:58:39 File transfer completed successfully.
>> >> >> 03/24 10:58:40 Job 112.13 set to execute immediately
>> >> >> 03/24 10:58:40 Starting a VANILLA universe job with ID: 112.13
>> >> >> 03/24 10:58:40 IWD: /var/lib/condor/execute/dir_2752
>> >> >> 03/24 10:58:40 Input file: /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.inp
>> >> >> 03/24 10:58:40 Output file: /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.out
>> >> >> 03/24 10:58:40 Error file: /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.err
>> >> >> 03/24 10:58:40 About to exec /var/lib/condor/execute/dir_2752/condor_exec.exe
>> >> >> 03/24 10:58:40 Create_Process(/var/lib/condor/execute/dir_2752/condor_exec.exe): child failed with errno 2 (No such file or directory) before exec()
>> >> >> 03/24 10:58:40 ERROR "Create_Process(/var/lib/condor/execute/dir_2752/condor_exec.exe,, ...) failed: No such file or directory" at line 530 in file os_proc.cpp
>> >> >> 03/24 10:58:40 ShutdownFast all jobs.
>> >> >>
>> >> >> In fact there are no subdirectories in /var/lib/condor/execute, so I
>> >> >> wonder whether it's having trouble creating them.  I added
>> >> >> transfer_executable = true to the submit file, even though it wasn't
>> >> >> needed before.  It didn't make any difference.
>> >> >>
>> >> >> The same version (7.4.1) is working on an older Fedora 12 machine, the
>> >> >> difference being that it may have an older version of condor_config,
>> >> >> as it's been upgraded several times.
>> >> >>
>> >> >>
>> >> >> Adam
>> >> >> _______________________________________________
>> >> >> Condor-users mailing list
>> >> >> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> >> >> subject: Unsubscribe
>> >> >> You can also unsubscribe by visiting
>> >> >> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>> >> >>
>> >> >> The archives can be found at:
>> >> >> https://lists.cs.wisc.edu/archive/condor-users/