
Re: [Condor-users] Condor-G error



I followed the steps for setting the GAHP variable, and the test worked, with the output:
$GahpVersion: 1.0.15 Oct 12 2006 UW\ Gahp $

But now I am getting the following error:

ERROR "Gahp Server (pid=17020) exited with status 255
" at line 278 in file gahp-client.C

What should I do next?

Thanks for the help so far.

________________________________________________________________________________________________________
1/25 01:26:49 ******************************************************
1/25 01:26:49 ** condor_gridmanager (CONDOR_GRIDMANAGER) STARTING UP
1/25 01:26:49 ** /usr/local/condor/sbin/condor_gridmanager
1/25 01:26:49 ** $CondorVersion: 6.8.2 Oct 12 2006 $
1/25 01:26:49 ** $CondorPlatform: I386-LINUX_RHEL3 $
1/25 01:26:49 ** PID = 17019
1/25 01:26:49 ** Log last touched 1/25 01:21:50
1/25 01:26:49 ******************************************************
1/25 01:26:49 Using config source: /usr/local/condor/etc/condor_config
1/25 01:26:49 Using local config sources:
1/25 01:26:49    /home/condor/condor_config.local
1/25 01:26:49 DaemonCore: Command Socket at <152.15.98.25:39644>
1/25 01:26:52 [17019] DaemonCore: Command received via UDP from host <152.15.98.25:59947>
1/25 01:26:52 [17019] DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
1/25 01:26:52 [17019] Found job 227.0 --- inserting
1/25 01:26:52 [17019] gahp server not up yet, delaying ping
1/25 01:26:52 [17019] gahp server not up yet, delaying checkDelegation
1/25 01:26:52 [17019] (227.0) doEvaluateState called: gmState GM_INIT, globusState 32
1/25 01:26:52 [17019] GAHP server pid = 17020
1/25 01:26:52 [17019] Failed to read GAHP server version
1/25 01:26:52 [17019] (227.0) Error initializing GAHP
1/25 01:26:52 [17019] ERROR "Gahp Server (pid=17020) exited with status 255
" at line 278 in file gahp-client.C
________________________________________________________________________________________________________
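
For anyone scanning a longer GridmanagerLog for the same failure, the fatal line in the excerpt above can be picked out mechanically. The following is only a sketch: the function name find_gahp_exits and the regular expression are my own assumptions, written against the line format visible in this excerpt rather than any documented Condor log format, so they may need adjusting for other versions.

```python
import re

# Assumed shape of the fatal line, taken from the log excerpt in this message:
#   1/25 01:26:52 [17019] ERROR "Gahp Server (pid=17020) exited with status 255
GAHP_EXIT = re.compile(
    r'^(?P<when>\d+/\d+ \d+:\d+:\d+) \[(?P<gm_pid>\d+)\] '
    r'ERROR "Gahp Server \(pid=(?P<gahp_pid>\d+)\) exited with status (?P<status>\d+)'
)


def find_gahp_exits(lines):
    """Return one dict per 'Gahp Server ... exited' line found in the log lines."""
    exits = []
    for line in lines:
        match = GAHP_EXIT.match(line.strip())
        if match:
            exits.append({
                "when": match.group("when"),
                "gridmanager_pid": int(match.group("gm_pid")),
                "gahp_pid": int(match.group("gahp_pid")),
                "status": int(match.group("status")),
            })
    return exits


# Three lines copied from the excerpt above; only the last one is a GAHP exit.
sample = [
    '1/25 01:26:52 [17019] GAHP server pid = 17020',
    '1/25 01:26:52 [17019] Failed to read GAHP server version',
    '1/25 01:26:52 [17019] ERROR "Gahp Server (pid=17020) exited with status 255',
]

for entry in find_gahp_exits(sample):
    print(entry)
```

To run it against a real log, pass an open file object instead of the sample list, for example the file that /tmp/GridmanagerLog.`id -un` expands to on the submit machine.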

On 1/24/07, Gabriel Mateescu <gabriel.mateescu@xxxxxx> wrote:
See

https://lists.cs.wisc.edu/archive/condor-users/2005-October/msg00057.shtml

On Wed, 2007-01-24 at 17:14, Jeremy Villalobos wrote:
> here is what I see in that log
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 1/24 17:05:51 ******************************************************
> 1/24 17:05:51 Using config source: /usr/local/condor/etc/condor_config
> 1/24 17:05:51 Using local config sources:
> 1/24 17:05:51    /home/condor/condor_config.local
> 1/24 17:05:51 DaemonCore: Command Socket at <152.15.98.25:53625>
> 1/24 17:05:54 [13166] DaemonCore: Command received via UDP from host <152.15.98.25:58894>
> 1/24 17:05:54 [13166] DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
> 1/24 17:05:54 [13166] Found job 218.0 --- inserting
> 1/24 17:05:54 [13166] gahp server not up yet, delaying ping
> 1/24 17:05:54 [13166] gahp server not up yet, delaying checkDelegation
> 1/24 17:05:54 [13166] (218.0) doEvaluateState called: gmState GM_INIT, globusState 32
> 1/24 17:05:54 [13166] GAHP server pid = 13167
> 1/24 17:05:54 [13166] Failed to read GAHP server version
> 1/24 17:05:54 [13166] (218.0) Error initializing GAHP
> 1/24 17:05:54 [13166] ERROR "Gahp Server (pid=13167) exited with status 255
> " at line 278 in file gahp-client.C
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> What is a GAHP server?
> I read in the documentation that virtually no changes to the Condor
> configuration are needed to get Condor-G to work. Is this right?
>
> On 1/24/07, Gabriel Mateescu < gabriel.mateescu@xxxxxx> wrote:
>         ... and, as Todd said, use
>
>           universe = grid
>
>         Gabriel
>
>         On Wed, 2007-01-24 at 16:17, Gabriel Mateescu wrote:
>         > Look in the Grid Manager log file,
>         >
>         >  /tmp/GridmanagerLog.`id  -un`
>         >
>         > and see if there is something about  job 218.
>         >
>         > Gabriel
>         >
>         >
>         > On Wed, 2007-01-24 at 16:06, Jeremy Villalobos wrote:
>         > > Thanks, that was part of the problem.
>         > > My next problem is that I get the following output from the queue:
>         > >
>         > > -- Submitter: coit-grid02.uncc.edu : <152.15.98.25:41642> : coit-grid02.uncc.edu
>         > >  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
>         > >  218.0   jfvillal        1/24 13:50   0+00:00:00 I  0   9.8  echo hello
>         > >  219.0   jfvillal        1/24 13:50   0+02:13:17 R  0   0.0  gridftp_wrapper.sh
>         > >
>         > > gridftp_wrapper.sh runs forever and "echo hello" never gets out of
>         > > the I (idle) status.
>         > >
>         > > the revised test.con script is:
>         > >
>         > > universe = globus
>         > > grid_resource = gt4 https://coit-grid02.uncc.edu:8440/wsrf/service/ManagedJobFactoryService Fork
>         > > Executable = /bin/echo
>         > > Arguments = hello
>         > > Log = simple.log
>         > > Output = simple.out
>         > > Error = simple.error
>         > > Queue
>         > >
>         > > I have tried both the "globus" and "grid" universes. Which one is
>         > > the most up to date?
>         > >
>         > > thanks for the help so far.
>         > >
>         > >
>         > > On 1/24/07, Gabriel Mateescu <gabriel.mateescu@xxxxxx> wrote:
>         > >         Hello,
>         > >
>         > >         You have misspelled grid_resource.
>         > >
>         > >         Gabriel
>         > >
>         > >
>         > >         On Sat, 2007-01-20 at 20:38, Jeremy Villalobos wrote:
>         > >         > Nope, the error keeps showing up.
>         > >         >
>         > >         >
>         > >         > On 1/20/07, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
>         > >         >         Re the below...
>         > >         >
>         > >         >         Just a wild guess, but change the line
>         > >         >            universe = globus
>         > >         >         to be instead
>         > >         >            universe = grid
>         > >         >
>         > >         >         IIRC, globus universe is deprecated syntax and is hard-wired
>         > >         >         to be a gt2 job for backwards compatibility.
>         > >         >
>         > >         >         Regards
>         > >         >         Todd
>         > >         >
>         > >         >         ---
>         > >         >         Todd Tannenbaum
>         > >         >         Dept of Computer Sciences
>         > >         >         University of Wisconsin-Madison
>         > >         >         <-- Sent from a Palm Treo 680 -->
>         > >         >
>         > >         >         -----Original Message-----
>         > >         >
>         > >         >         From:  "Jeremy Villalobos" <jeremyvillalobos@xxxxxxxxx>
>         > >         >         Subj:  [Condor-users] Condor-G error
>         > >         >         Date:  Sat Jan 20, 2007 4:20 pm
>         > >         >         Size:  872 bytes
>         > >         >         To:  "Condor-Users Mail List" <condor-users@xxxxxxxxxxx>
>         > >         >
>         > >         >         Hello:
>         > >         >         I am trying to submit a job with Condor-G to the globus
>         > >         >         universe with the following script:
>         > >         >
>         > >         >         universe = globus
>         > >         >         grid_resoure = gt4 https://coit-grid02.uncc.edu:8440/wsrf/services/ManagedJobFactoryService fork
>         > >         >         Executable = /bin/echo
>         > >         >         Arguments = hello
>         > >         >         Log = simple.log
>         > >         >         Output = simple.out
>         > >         >         Error = simple.error
>         > >         >         Queue
>         > >         >
>         > >         >         But I am getting the error:
>         > >         >
>         > >         >         Submitting job(s)
>         > >         >         ERROR: No resource identifier was found.
>         > >         >
>         > >         >         What does it mean ?
>         > >         >
>         > >         >         I have the whole Globus system configured and working by
>         > >         >         itself, and Condor is configured and working by itself. But
>         > >         >         the configuration to get both of them to work together
>         > >         >         gives the error above.
>         > >         >
>         > >         >         Thanks for any help
>         > >         >
>         > >         >           --- message truncated ---
>         > >         >
>         > >         >
>         _______________________________________________
>         > >         >         Condor-users mailing list
>         > >         >         To unsubscribe, send a message to
>         > >         >         condor-users-request@xxxxxxxxxxx with a
>         > >         >         subject: Unsubscribe
>         > >         >         You can also unsubscribe by visiting
>         > >         >
>         > >
>         https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>         > >         >
>         > >         >         The archives can be found at either
>         > >         >
>         https://lists.cs.wisc.edu/archive/condor-users/
>         > >         >
>         > >
>         http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
>         > >         >
>         > >         >
>         > >         >
>         > >         >
>         > >
>         > >
>
>