
Re: [Condor-users] [gt-user] Wanna Help



Mehdi,
The problem seems to be what shows up in the SchedLog: Condor tries to
change the ownership of some files from user grid to user condor, and
that fails. Sorry, this exceeds my Condor knowledge.
Good luck!
Martin

Dear Martin,
Yes, the local user account that my DN is mapped to in the
grid-mapfile is available on the Condor compute nodes.

The following are the Condor logs.
--------------------------------------------------------------
/home/condor/hosts/Server/log/SchedLog:4/19 16:11:28 (pid:3046)
(154.135) Failed to chown
/home/condor/hosts/Server/spool/cluster154.proc135.subproc0 from 504
to 506.507.  User may run into permissions problems when fetching
sandbox.
-------------------------------------------------------------------------------------------
/home/condor/hosts/localhost002/log/StarterLog:4/19 23:00:40 Failed to
open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/30671.1176986459/stdout011'
as standard output: Permission denied (errno 13)
/home/condor/hosts/localhost002/log/StarterLog:4/19 23:00:40 Failed to
open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/30671.1176986459/stderr011'
as standard error: Permission denied (errno 13)
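
Errno 13 (EACCES) on open means that the account the job runs as lacks
write permission on the target, or search permission on one of its parent
directories. A minimal sketch of the classic owner/group/other check
follows. The directory mode 0700 and the idea that the job runs as condor
(506:507) are hypothetical; neither is stated in the logs. ACLs and root's
override are ignored.

```python
def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Return True if the classic Unix permission bits grant `want`.

    want is a mask: 4 = read, 2 = write, 1 = execute (search, for directories).
    Only one bit class applies: owner, else group, else other.
    """
    if uid == owner_uid:
        bits = (mode >> 6) & 0o7
    elif owner_gid in gids:
        bits = (mode >> 3) & 0o7
    else:
        bits = mode & 0o7
    return (bits & want) == want


# uid and gid values taken from the /etc/passwd entries quoted in this message.
GRID = (504, [504])
CONDOR = (506, [507])

# HYPOTHETICAL: a home directory owned by grid:grid with mode 0700.
# The real mode of /home/grid is not given anywhere in the thread.
home_mode, home_uid, home_gid = 0o700, 504, 504

grid_ok = may_access(home_mode, home_uid, home_gid, GRID[0], GRID[1], 1)
condor_ok = may_access(home_mode, home_uid, home_gid, CONDOR[0], CONDOR[1], 1)
print("grid can search the directory:", grid_ok)
print("condor can search the directory:", condor_ok)
```

Running "ls -ld" on each component of the path from the StarterLog, on the
compute node itself, would supply the real modes and owners to feed into
this check.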

The following are the relevant /etc/passwd entries on the Server and on the Condor compute nodes.
grid:x:504:504::/home/grid:/bin/bash
gwadmin:x:505:506::/home/gwadmin:/bin/bash
condor:x:506:507::/home/condor:/bin/bash
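
To read the SchedLog line against these entries, the numeric IDs can be
mapped back to account names. The sketch below assumes that "506.507" in
the log message means uid 506 and gid 507, which matches the condor entry
above; the helper names are made up for illustration.

```python
import re

# /etc/passwd entries quoted in this message.
PASSWD = """\
grid:x:504:504::/home/grid:/bin/bash
gwadmin:x:505:506::/home/gwadmin:/bin/bash
condor:x:506:507::/home/condor:/bin/bash
"""

# SchedLog text from this message, joined onto one line (the mail wrapped it).
SCHEDLOG = ("(154.135) Failed to chown "
            "/home/condor/hosts/Server/spool/cluster154.proc135.subproc0 "
            "from 504 to 506.507.")


def parse_passwd(text):
    """Return {uid: (name, gid)} from passwd-format text."""
    users = {}
    for line in text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4:
            users[int(fields[2])] = (fields[0], int(fields[3]))
    return users


def explain_chown(message, users):
    """Name the accounts in a 'Failed to chown PATH from A to B.C' message.

    ASSUMPTION: 'B.C' is read as uid.gid.
    """
    m = re.search(r"Failed to chown (\S+) from (\d+) to (\d+)\.(\d+)", message)
    if m is None:
        return None
    old_uid, new_uid, new_gid = int(m.group(2)), int(m.group(3)), int(m.group(4))
    return {
        "path": m.group(1),
        "from": users.get(old_uid, ("?", None))[0],
        "to": users.get(new_uid, ("?", None))[0],
        "to_gid": new_gid,
    }


result = explain_chown(SCHEDLOG, parse_passwd(PASSWD))
print(result)
```

Under that reading, the schedd tried to hand the spool directory from grid
to condor, which is consistent with Martin's interpretation of the log.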




On 4/19/07, Martin Feller <feller@xxxxxxxxxxx> wrote:
Mehdi,
Do the Condor-logs provide more information about that?
Is the local user-account, to which your DN in the grid-mapfile
gets mapped, available on the condor compute nodes?
Martin

> Hi Martin,
> Yes, it works with the fork jobmanager.
>
> On 4/19/07, Martin Feller <feller@xxxxxxxxxxx> wrote:
>> Does it work with fork?
>> Martin
>>
>> > Hi,
>> >  I want to submit a job to a Condor pool via Globus GRAM. I defined the
>> > following RSL script and submitted the job with "globusrun -f test2.rsl"
>> > from the Server itself, acting as the client. The job goes into the Held
>> > state. My RSL script file (test2.rsl) is:
>> > ------------------------test2.rsl-----------------------------------
>> > +
>> > (
>> >   &(resourceManagerContact="Server.eng4.shirazu.ac.ir/jobmanager-condor")
>> >   (count=1)
>> >   (label="subjob 0")
>> >   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0)
>> >                (LD_LIBRARY_PATH /usr/local/globus-4.0.3/lib/))
>> >   (directory="/home/grid/globusTest/GRAM/Test2")
>> >   (executable="/bin/ls")
>> >   (arguments  = "-R" "/tmp")
>> >   (stdout="lsoutput")
>> >   (stderr="lserr")
>> > )
>> > -----------------------------------------------------------------------
>> >
>> > The output of globus-condor.log file is:
>> > -------------------------- globus-condor.log -----------------------------
>> > <c>
>> >    <a n="MyType"><s>SubmitEvent</s></a>
>> >    <a n="EventTypeNumber"><i>0</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:47:58</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="SubmitHost"><s>&lt;192.168.1.254:47104&gt;</s></a>
>> > </c>
>> > <c>
>> >    <a n="MyType"><s>SubmitEvent</s></a>
>> >    <a n="EventTypeNumber"><i>0</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:47:58</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="SubmitHost"><s>&lt;192.168.1.254:47104&gt;</s></a>
>> > </c>
>> > <c>
>> >    <a n="MyType"><s>ShadowExceptionEvent</s></a>
>> >    <a n="EventTypeNumber"><i>7</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:48:02</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="Message"><s>Error from starter on localhost001: Failed to open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/15222.1176880678/stdout' as standard output: Permission denied (errno 13)</s></a>
>> >    <a n="SentBytes"><r>0.000000000000000E+00</r></a>
>> >    <a n="ReceivedBytes"><r>0.000000000000000E+00</r></a>
>> > </c>
>> > <c>
>> >    <a n="MyType"><s>JobHeldEvent</s></a>
>> >    <a n="EventTypeNumber"><i>12</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:48:02</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="HoldReason"><s>Error from starter on localhost001: Failed to open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/15222.1176880678/stdout' as standard output: Permission denied (errno 13)</s></a>
>> >    <a n="HoldReasonCode"><i>7</i></a>
>> >    <a n="HoldReasonSubCode"><i>7</i></a>
>> > </c>
>> > <c>
>> >    <a n="MyType"><s>ShadowExceptionEvent</s></a>
>> >    <a n="EventTypeNumber"><i>7</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:48:02</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="Message"><s>Error from starter on localhost001: Failed to open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/15222.1176880678/stdout' as standard output: Permission denied (errno 13)</s></a>
>> >    <a n="SentBytes"><r>0.000000000000000E+00</r></a>
>> >    <a n="ReceivedBytes"><r>0.000000000000000E+00</r></a>
>> > </c>
>> > <c>
>> >    <a n="MyType"><s>JobHeldEvent</s></a>
>> >    <a n="EventTypeNumber"><i>12</i></a>
>> >    <a n="EventTime"><s>2007-04-18T10:48:02</s></a>
>> >    <a n="Cluster"><i>126</i></a>
>> >    <a n="Proc"><i>0</i></a>
>> >    <a n="Subproc"><i>0</i></a>
>> >    <a n="HoldReason"><s>Error from starter on localhost001: Failed to open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/15222.1176880678/stdout' as standard output: Permission denied (errno 13)</s></a>
>> >    <a n="HoldReasonCode"><i>7</i></a>
>> >    <a n="HoldReasonSubCode"><i>7</i></a>
>> > </c>
>> > -------------------------------------------------------------------
>> >
>> > Can you please help me?
>>
>>
>
>
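
The globus-condor.log quoted above is a sequence of <c> elements with no
single root, so it can be wrapped in one and read with a standard XML
parser. The sketch below pulls the hold reason out of each JobHeldEvent.
Reading <s>, <i> and <r> as string, integer and real is an inference from
the quoted log, not something taken from Condor documentation, and the
sample is a shortened copy of the quoted events.

```python
import xml.etree.ElementTree as ET

# Shortened copy of two events from the quoted log.
SAMPLE = """\
<c>
   <a n="MyType"><s>SubmitEvent</s></a>
   <a n="EventTypeNumber"><i>0</i></a>
   <a n="Cluster"><i>126</i></a>
   <a n="Proc"><i>0</i></a>
</c>
<c>
   <a n="MyType"><s>JobHeldEvent</s></a>
   <a n="EventTypeNumber"><i>12</i></a>
   <a n="Cluster"><i>126</i></a>
   <a n="Proc"><i>0</i></a>
   <a n="HoldReason"><s>Error from starter on localhost001: Failed to open '/home/grid/.globus/job/server.eng4.shirazu.ac.ir/15222.1176880678/stdout' as standard output: Permission denied (errno 13)</s></a>
   <a n="HoldReasonCode"><i>7</i></a>
</c>
"""


def parse_events(text):
    """Return one dict per <c> element, keyed by each attribute's n= name."""
    # The log has no single root element, so wrap it in one before parsing.
    root = ET.fromstring("<log>" + text + "</log>")
    events = []
    for c in root.findall("c"):
        event = {}
        for a in c.findall("a"):
            if len(a) == 0:
                continue  # no typed child element; skip
            value = a[0]
            if value.tag == "i":      # ASSUMPTION: integer
                event[a.get("n")] = int(value.text)
            elif value.tag == "r":    # ASSUMPTION: real number
                event[a.get("n")] = float(value.text)
            else:                     # <s> and anything else kept as text
                event[a.get("n")] = value.text
        events.append(event)
    return events


def hold_reasons(events):
    """Return (cluster, proc, reason) for every JobHeldEvent."""
    return [(e.get("Cluster"), e.get("Proc"), e.get("HoldReason"))
            for e in events if e.get("MyType") == "JobHeldEvent"]


held = hold_reasons(parse_events(SAMPLE))
for cluster, proc, reason in held:
    print(f"{cluster}.{proc}: {reason}")
```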