Re: [Condor-users] Directory creation problem with 7.4.1 on Fedora 12?



On Wed, Mar 24, 2010 at 3:47 PM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
> Couple of questions:
>
>        1.) How was condor started? (as root or as a user?)

It's set to start automatically and when testing I run

service condor restart

etc.  It appears to run condor_master as the 'condor' user, which has
its home directory as /var/lib/condor

>        2.) Could you post your condor_config?
>

Attached.  I've removed a couple of the local host ranges from the
ALLOW directives.  I added the old-style HOSTALLOW in case that made a
difference, which it didn't.

Cheers,
Adam

> Cheers,
> Tim
>
>
> On Wed, 2010-03-24 at 11:11 +0000, Adam Huffman wrote:
>> I've installed 7.4.1 from the Fedora repositories on a brand new
>> Fedora 12 machine.
>>
>> Whenever I submit jobs to that machine, the jobs move to the Hold
>> state.  Here's an example error message on the central manager:
>>
>> 112.016:  Request is held.
>>
>> Hold reason: Error from starter on slot21@...man.ac.uk: Failed to
>> execute '/var/lib/condor/execute/dir_2772/condor_exec.exe': No such
>> file or directory
>>
>> and on the execute host itself:
>>
>>
>> 03/24 10:58:39 Using config source: /etc/condor/condor_config
>> 03/24 10:58:39 Using local config sources:
>> 03/24 10:58:39    /var/lib/condor/condor_config.local
>> 03/24 10:58:39 DaemonCore: Command Socket at <...:44951>
>> 03/24 10:58:39 Done setting resource limits
>> 03/24 10:58:39 Communicating with shadow <...:36474>
>> 03/24 10:58:39 Submitting machine is "...man.ac.uk"
>> 03/24 10:58:39 setting the orig job name in starter
>> 03/24 10:58:39 setting the orig job iwd in starter
>> 03/24 10:58:39 File transfer completed successfully.
>> 03/24 10:58:40 Job 112.13 set to execute immediately
>> 03/24 10:58:40 Starting a VANILLA universe job with ID: 112.13
>> 03/24 10:58:40 IWD: /var/lib/condor/execute/dir_2752
>> 03/24 10:58:40 Input file:
>> /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.inp
>> 03/24 10:58:40 Output file:
>> /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.out
>> 03/24 10:58:40 Error file:
>> /var/lib/condor/execute/dir_2752/dammin-mer_cb_clair.13.err
>> 03/24 10:58:40 About to exec /var/lib/condor/execute/dir_2752/condor_exec.exe
>> 03/24 10:58:40 Create_Process(/var/lib/condor/execute/dir_2752/condor_exec.exe):
>> child failed with errno 2 (No such file or directory) before exec()
>> 03/24 10:58:40 ERROR
>> "Create_Process(/var/lib/condor/execute/dir_2752/condor_exec.exe,,
>> ...) failed: No such file or directory" at line 530 in file
>> os_proc.cpp
>> 03/24 10:58:40 ShutdownFast all jobs.
>>
>> In fact there are no subdirectories in /var/lib/condor/execute, so I
>> wonder whether it's having trouble creating them.  I added
>> transfer_executable = true to the submit file, even though it wasn't
>> needed before.  It didn't make any difference.
>>
>> The same version (7.4.1) is working on an older Fedora 12 machine, the
>> difference being that it may have an older version of condor_config,
>> as it's been upgraded several times.
>>
>>
>> Adam
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
>

Attachment: condor_config.bz2
Description: BZip2 compressed data