
Re: [Condor-users] Failed to initialize GAHP



Hello Steven,
Thanks for your reply.

globusrun-ws is working fine on the remote machine.
Here is the output:

ds001 % globusrun-ws -submit -f ~kmuriki/sdsc.xml
Submitting job...Done.
Job ID: uuid:1fba6304-5d79-11dc-94e8-01000000568f
Termination time: 09/08/2007 19:32 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
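
If this check needs to be repeated, the state lines above can be scanned with a short script. This is only a minimal sketch, assuming the output keeps the "Current job state:" format shown here; the function name final_job_state is hypothetical and not part of any Globus or Condor tool.

```python
import re

def final_job_state(output):
    """Return the last 'Current job state' value in the given text, or None."""
    states = re.findall(r"^Current job state:\s*(\S+)", output, flags=re.MULTILINE)
    return states[-1] if states else None

# Sample text copied from the globusrun-ws run above.
sample = """Submitting job...Done.
Job ID: uuid:1fba6304-5d79-11dc-94e8-01000000568f
Termination time: 09/08/2007 19:32 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
"""

print(final_job_state(sample))
```

A final state other than Done (or no state line at all) would suggest the remote side is worth a closer look.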

And this is the sdsc.xml:

<job>
    <factoryEndpoint
          xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
          xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">

        <wsa:Address>https://tg-login1.sdsc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>

        <wsa:ReferenceProperties>
            <gram:ResourceID>Fork</gram:ResourceID>
        </wsa:ReferenceProperties>
    </factoryEndpoint>
    <executable>/bin/hostname</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <stdout>sdsc.out</stdout>
    <stderr>sdsc.err</stderr>
</job>
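
To double-check which endpoint and resource a job description like this targets, the file can be parsed with Python's standard xml.etree.ElementTree module. This is a rough sketch only: the helper name summarize_job and the inlined copy of the XML are assumptions made for illustration, and the namespace URIs are the ones declared in sdsc.xml above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on <factoryEndpoint> in sdsc.xml.
WSA = "http://schemas.xmlsoap.org/ws/2004/03/addressing"
GRAM = "http://www.globus.org/namespaces/2004/10/gram/job"

# Inlined copy of the job description, for a self-contained example.
SDSC_XML = """<job>
    <factoryEndpoint
          xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
          xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
        <wsa:Address>https://tg-login1.sdsc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
        <wsa:ReferenceProperties>
            <gram:ResourceID>Fork</gram:ResourceID>
        </wsa:ReferenceProperties>
    </factoryEndpoint>
    <executable>/bin/hostname</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <stdout>sdsc.out</stdout>
    <stderr>sdsc.err</stderr>
</job>
"""

def summarize_job(xml_text):
    """Extract the factory address, resource ID, and executable from a job description."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    address = root.findtext(f"factoryEndpoint/{{{WSA}}}Address", default="")
    resource = root.findtext(
        f"factoryEndpoint/{{{WSA}}}ReferenceProperties/{{{GRAM}}}ResourceID",
        default="",
    )
    executable = root.findtext("executable", default="")
    return {
        "address": address.strip(),
        "resource_id": resource.strip(),
        "executable": executable.strip(),
    }

for key, value in summarize_job(SDSC_XML).items():
    print(f"{key}: {value}")
```

The extracted address makes it easy to compare the host used here (tg-login1.sdsc.teragrid.org) with the one passed to condor_glidein (tg-login.sdsc.teragrid.org).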

--Mustafa

On Thursday 06 September 2007 19:26:12 Steven Timm wrote:
> Error code 3 looks like a Globus error 3 on the remote host to which
> you are trying to submit, which means a lack of resources, either memory
> or disk, on the remote machine.
> If you have access to the command-line utility globusrun-ws, try
> running it against the remote server and see whether you get a Globus
> error 3 as well; I think you will.  If so, then the problem is on the
> remote end.
> Steve Timm
>
> ------------------------------------------------------------------
> Steven C. Timm, Ph.D  (630) 840-8525
> timm@xxxxxxxx  http://home.fnal.gov/~timm/
> Fermilab Computing Division, Scientific Computing Facilities,
> Grid Facilities Department, FermiGrid Services Group, Assistant Group
> Leader.
>
> On Thu, 6 Sep 2007, Mustafa R Kilinc wrote:
> > All right. We found the gridmanager log:
> >
> > 9/6 15:05:54 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
> >
> > 9/6 15:05:54 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
> >
> > 9/6 15:05:54 ******************************************************
> > 9/6 15:05:54 ** condor_gridmanager (CONDOR_GRIDMANAGER) STARTING UP
> > 9/6 15:05:54 ** /home/mustafa/condor/sbin/condor_gridmanager
> > 9/6 15:05:54 ** $CondorVersion: 6.8.5 May 17 2007 $
> > 9/6 15:05:54 ** $CondorPlatform: I386-LINUX_RHEL3 $
> > 9/6 15:05:54 ** PID = 11012
> > 9/6 15:05:54 ** Log last touched 9/6 14:58:51
> > 9/6 15:05:54 ******************************************************
> > 9/6 15:05:54 Using config source: /home/mustafa/condor/etc/condor_config
> > 9/6 15:05:54 Using local config sources:
> > 9/6 15:05:54    /home/mustafa/condor/home/condor_config.local
> > 9/6 15:05:54 DaemonCore: Command Socket at <0.0.0.0:41874>
> > 9/6 15:05:57 [11012] DaemonCore: Command received via UDP from host <127.0.0.1:33502>
> > 9/6 15:05:57 [11012] DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
> > 9/6 15:05:57 [11012] Found job 24.0 --- inserting
> > 9/6 15:05:57 [11012] gahp server not up yet, delaying ping
> > 9/6 15:05:57 [11012] (24.0) doEvaluateState called: gmState GM_INIT, globusState 32
> > 9/6 15:05:57 [11012] GAHP server pid = 11021
> > 9/6 15:05:57 [11012] GAHP command 'GRAM_CALLBACK_ALLOW' failed: an I/O operation failed error_code=3
> > 9/6 15:05:57 [11012] (24.0) Error enabling GRAM callback, err=3 - an I/O operation failed
> > 9/6 15:06:02 [11012] No jobs left, shutting down
> > 9/6 15:06:02 [11012] Got SIGTERM. Performing graceful shutdown.
> > 9/6 15:06:02 [11012] **** condor_gridmanager (condor_GRIDMANAGER) EXITING WITH STATUS 0
> >
> > Mustafa
> >
> > On Thursday 06 September 2007 14:56:23 Mustafa R Kilinc wrote:
> >> We do not see gridmanager logs being created in my home directory on the
> >> remote GRAM server (tg-login.sdsc.teragrid.org). Do we need to look for
> >> them somewhere else?
> >> We have placed the SDSC CA certificate files in the
> >> $(HOME)/.globus/certificates directory of our personal Condor master.
> >>
> >> mustafa@mustafa:~$ pwd
> >> /home/mustafa
> >> mustafa@mustafa:~$ ls -al .globus/certificates/
> >> total 32
> >> drwxr-xr-x 2 mustafa mustafa  4096 2007-09-06 11:29 .
> >> drwxr-xr-x 3 mustafa mustafa  4096 2007-09-06 11:27 ..
> >> -rw-r--r-- 1 mustafa mustafa  1468 2004-09-08 19:48 3deda549.0
> >> -rw-r--r-- 1 mustafa mustafa   336 2004-10-07 14:10 3deda549.signing_policy
> >> mustafa@mustafa:
> >>
> >> Do we need to place these SDSC CA cert files anywhere else?
> >>
> >> Mustafa
> >>
> >> On Thursday 06 September 2007 13:59:53 Dan Bradley wrote:
> >>> Your gridmanager log may contain clues about why the gahp failed to be
> >>> initialized.  One possible reason is that Condor is not finding the
> >>> necessary CA certs for your grid proxy.  This happens if you have your
> >>> trusted certs installed in a non-standard location, and Condor has not
> >>> been configured to know where to find them.
> >>>
> >>> --Dan
> >>>
> >>> Mustafa R Kilinc wrote:
> >>>> Trying to use condor_glidein and stuck with this problem.
> >>>> Any suggestions on how I can fix this:
> >>>>
> >>>> Please see the log below:
> >>>>
> >>>> mustafa@mustafa:~/condor$ grid-proxy-info
> >>>> subject  : /C=US/O=SDSC/OU=SDSC/CN=Mustafa Kilinc/UID=mkilinc/CN=1444578778
> >>>> issuer   : /C=US/O=SDSC/OU=SDSC/CN=Mustafa Kilinc/UID=mkilinc
> >>>> identity : /C=US/O=SDSC/OU=SDSC/CN=Mustafa Kilinc/UID=mkilinc
> >>>> type     : Proxy draft (pre-RFC) compliant impersonation proxy
> >>>> strength : 512 bits
> >>>> path     : /tmp/x509up_u1000
> >>>> timeleft : 9:44:46
> >>>>
> >>>> mustafa@mustafa:~/condor$ condor_glidein -arch=linux-sles8-ia64 -setup_jobmanager=jobmanager-fork tg-login.sdsc.teragrid.org/jobmanager-pbs
> >>>>
> >>>> Running/verifying Glidein installation and setup...
> >>>> Submitting Glidein setup job...
> >>>> Error: the setup job has been put on hold!  Reason given:
> >>>> Failed to initialize GAHP
> >>>> Error: failed to run setup script on remote machine
> >>>>
> >>>> mustafa@mustafa:~/condor$ gt4_gahp
> >>>> $GahpVersion: 1.6.0 Dec 08 2006 GT4\ GAHP\ (GT-4.0.3) $
> >>>>
> >>>> mustafa@mustafa:~/condor$
> >>>>
> >>>> thanks,
> >>>> Mustafa
> >>>> _______________________________________________
> >>>> Condor-users mailing list
> >>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
> >>>> with a subject: Unsubscribe
> >>>> You can also unsubscribe by visiting
> >>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >>>>
> >>>> The archives can be found at:
> >>>> https://lists.cs.wisc.edu/archive/condor-users/