
Re: [Condor-users] condor and globus webservices

On Wed, 2005-03-30 at 12:02, Erik Paulson wrote:
> On Wed, Mar 30, 2005 at 11:56:37AM -0500, Murali Ramsunder wrote:
> > Hi,
> > 
> > I have Condor installed from vdt-1.3.2 and I'm trying to submit to
> > another machine that runs GT-3.2 WS (web services). Submitting to the
> > other machine with managed-job-factory works; however, when I use Condor
> > it fails.
> > Has anyone tried this? Any help appreciated.
> > 
> What does the Gridmanager log say?
> -Erik

Hi Erik,

I hope this captures the essence of the Gridmanager log for the run.
And, thanks for posting the message to the list.


3/30 10:39:37 [27954] Received ADD_JOBS signal
3/30 10:39:37 [27954] in doContactSchedd()
3/30 10:39:37 [27954] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/30 10:39:37 [27954] AUTHENTICATE_FS: used file /tmp/qmgr_5Z6Zzx, status: 1
3/30 10:39:37 [27954] querying for new jobs
3/30 10:39:37 [27954] Using constraint ((Owner=?="murali"&&JobUniverse==9)) && ((Matched =!= FALSE && JobStatus != 5) || Managed =?= TRUE)
3/30 10:39:37 [27954] ***Trying job type Mirror
3/30 10:39:37 [27954] ***Trying job type INFNBatch
3/30 10:39:37 [27954] ***Trying job type Condor
3/30 10:39:37 [27954] ***Trying job type GT3
3/30 10:39:37 [27954] Using job type GT3 for job 132.0
3/30 10:39:37 [27954] GRIDMANAGER_MAX_PENDING_REQUESTS is undefined, using default value of 50
3/30 10:39:37 [27954] GRIDMANAGER_MAX_PENDING_REQUESTS is undefined, using default value of 50
3/30 10:39:37 [27954] Found job 132.0 --- inserting
3/30 10:39:37 [27954] Fetched 1 new job ads from schedd
3/30 10:39:37 [27954] querying for removed/held jobs
3/30 10:39:37 [27954] Using constraint ((Owner=?="murali"&&JobUniverse==9)) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= TRUE))
3/30 10:39:37 [27954] Fetched 0 job ads from schedd
3/30 10:39:37 [27954] leaving doContactSchedd()
3/30 10:39:37 [27954] (132.0) doEvaluateState called: gmState GM_INIT, globusState 32
3/30 10:39:37 [27954] GAHP server pid = 27955
3/30 10:39:37 [27954] GAHP server version: $GahpVersion: 1.1.0 Apr 26 2004 GT3.2 GAHP v0.0 (alpha) $
3/30 10:39:37 [27954] GAHP[27955] <- 'COMMANDS'
3/30 10:39:37 [27954] GAHP[27955] -> 'S' 'ASYNC_MODE_OFF'
3/30 10:39:37 [27954] GAHP[27955] <- 'ASYNC_MODE_ON'
3/30 10:39:38 [27954] GAHP[27955] -> 'S'
3/30 10:39:38 [27954] GAHP[27955] <- 'INITIALIZE_FROM_FILE
3/30 10:39:38 [27954] GAHP[27955] -> 'S'
3/30 10:39:38 [27954] GAHP[27955] <- 'CACHE_PROXY_FROM_FILE 2
3/30 10:39:38 [27954] GAHP[27955] -> 'S'
3/30 10:39:38 [27954] GAHP[27955] <- 'CACHE_PROXY_FROM_FILE 1
3/30 10:39:38 [27954] GAHP[27955] -> 'S'
3/30 10:39:38 [27954] GAHP[27955] <- 'GT3_GRAM_CALLBACK_ALLOW 2 0'
3/30 10:39:40 [27954] GAHP[27955] -> 'S' '1'
3/30 10:39:40 [27954] GAHP[27955] <- 'GASS_SERVER_INIT 3 0'
3/30 10:39:40 [27954] GAHP[27955] -> 'S'
3/30 10:39:40 [27954] GAHP[27955] <- 'RESULTS'
3/30 10:39:40 [27954] GAHP[27955] -> 'S' '0'
3/30 10:39:41 [27954] GAHP[27955] <- 'RESULTS'
3/30 10:39:41 [27954] GAHP[27955] -> 'S' '1'
3/30 10:39:41 [27954] GAHP[27955] -> '3' '0' 'https://IPaddr:53590'
3/30 10:39:41 [27954] (132.0) gm state change: GM_INIT -> GM_START
3/30 10:39:41 [27954] (132.0) gm state change: GM_START ->
3/30 10:39:41 [27954] (132.0) gm state change: GM_CLEAR_REQUEST ->
3/30 10:39:41 [27954] (132.0) gm state change: GM_UNSUBMITTED ->
3/30 10:39:41 [27954] GAHP[27955] <- 'USE_CACHED_PROXY 1'
3/30 10:39:41 [27954] GAHP[27955] -> 'S'
3/30 10:39:41 [27954] GAHP[27955] <- 'GT3_GRAM_JOB_CREATE 4
http://hostname:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService 1 &(rsl_substitution=(GRIDMANAGER_GASS_URL\ https://IPaddr:53590))(executable=$(GRIDMANAGER_GASS_URL)#'//bin/hostname')(scratchdir='')(directory=$(SCRATCH_DIRECTORY))(stdout=$(GRIDMANAGER_GASS_URL)#'//usr1/home/murali/1072/out')(stderr=$(GRIDMANAGER_GASS_URL)#'//usr1/home/murali/1072/err')(proxy_timeout=240)(remote_io_url=$(GRIDMANAGER_GASS_URL))'
3/30 10:39:41 [27954] GAHP[27955] -> 'S'
3/30 10:39:41 [27954] gahp server not up yet, delaying ping
3/30 10:39:41 [27954] Error from GAHP[27955]:
0    [Thread-0] DEBUG org.globus.ogsa.config.ContainerConfig  - trying to load file: client-server-config.wsdd
3    [Thread-0] DEBUG org.globus.ogsa.config.ContainerConfig  - loading configuration from file: client-server-config.wsdd
1253 [main] DEBUG org.globus.ogsa.server.ServiceContainer  -
3/30 10:39:41 [27954] (132.0) doEvaluateState called: gmState GM_SUBMIT, globusState 32
3/30 10:39:41 [27954] GAHP[27955] <- 'RESULTS'
3/30 10:39:41 [27954] GAHP[27955] -> 'S' '0'
3/30 10:39:43 [27954] Error from GAHP[27955]:
java.lang.IllegalAccessError: tried to access field
org.apache.xpath.compiler.FunctionTable.m_functions from class
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.xml.security.Init.init(Unknown Source)-----
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:150)
        at org.apache.axis.SimpleChain.invoke(SimpleChain.java:120)-----
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.axis.client.AxisClient.invoke(AxisClient.java:167)
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.axis.client.Call.invokeEngine(Call.java:2564)
        at org.apache.axis.client.Call.invoke(Call.java:2553)
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.axis.client.Call.invoke(Call.java:2248)
        at org.apache.axis.client.Call.invoke(Call.java:2171)-----
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at org.apache.axis.client.Call.invoke(Call.java:1691)
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
3/30 10:39:43 [27954] Error from GAHP[27955]:
        at java.lang.Thread.run(Thread.java:534)
3/30 10:39:46 [27954] gahp server not up yet, delaying ping
3/30 10:39:49 [27954] DaemonCore::IsPidAlive(): kill returned EPERM, assuming pid 9125 is alive.
3/30 10:39:51 [27954] gahp server not up yet, delaying ping
3/30 10:39:56 [27954] gahp server not up yet, delaying ping
3/30 10:40:01 [27954] gahp server not up yet, delaying ping
3/30 10:40:06 [27954] gahp server not up yet, delaying ping
3/30 10:40:07 [27954] (132.0) Evaluating periodic job policy expressions
3/30 10:40:11 [27954] gahp server not up yet, delaying ping
3/30 10:40:16 [27954] gahp server not up yet, delaying ping
3/30 10:40:21 [27954] gahp server not up yet, delaying ping
3/30 10:40:26 [27954] gahp server not up yet, delaying ping
3/30 10:40:31 [27954] gahp server not up yet, delaying ping
3/30 10:40:36 [27954] gahp server not up yet, delaying ping
3/30 10:40:37 [27954] (132.0) Evaluating periodic job policy expressions
3/30 10:40:38 [27954] GAHP[27955] <- 'RESULTS'
3/30 10:40:38 [27954] GAHP[27955] -> 'S' '0'

3/30 10:44:41 [27954] gahp server not up yet, delaying ping
3/30 10:44:42 [27954] (132.0) doEvaluateState called: gmState GM_SUBMIT, globusState 32
3/30 10:44:42 [27954] (132.0) gmState GM_SUBMIT, globusState 32: globus_gram_client_job_create() returned Globus error -103
3/30 10:44:42 [27954] (132.0)   
3/30 10:44:42 [27954] (132.0) Writing submit-failed record to user
3/30 10:44:42 [27954] (132.0) gm state change: GM_SUBMIT ->
3/30 10:44:42 [27954] (132.0) gm state change: GM_UNSUBMITTED ->
3/30 10:44:42 [27954] (132.0) gm state change: GM_SUBMIT -> GM_HOLD
3/30 10:44:42 [27954] (132.0) Writing hold record to user logfile
3/30 10:44:42 [27954] (132.0) gm state change: GM_HOLD -> GM_DELETE
3/30 10:44:42 [27954] in doContactSchedd()
3/30 10:44:42 [27954] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/30 10:44:42 [27954] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of
3/30 10:44:42 [27954] AUTHENTICATE_FS: used file /tmp/qmgr_kFEdOY, status: 1
3/30 10:44:42 [27954] querying for removed/held jobs
3/30 10:44:42 [27954] Using constraint ((Owner=?="murali"&&JobUniverse==9)) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= TRUE))
3/30 10:44:42 [27954] Fetched 0 job ads from schedd
3/30 10:44:42 [27954] Updating classad values for 132.0:
3/30 10:44:42 [27954]    JobStatus = 5
3/30 10:44:42 [27954]    EnteredCurrentStatus = 1112197482
3/30 10:44:42 [27954]    HoldReason = "Attempts to submit failed"
3/30 10:44:42 [27954]    HoldReasonCode = 0
3/30 10:44:42 [27954]    HoldReasonSubCode = 0
3/30 10:44:42 [27954]    ReleaseReason = UNDEFINED
3/30 10:44:42 [27954]    NumSystemHolds = 1
3/30 10:44:42 [27954]    Managed = FALSE
3/30 10:44:42 [27954] GAHP[27955] <- 'UNCACHE_PROXY 1'
3/30 10:44:42 [27954] GAHP[27955] -> 'S'
3/30 10:44:42 [27954] GAHP[27955] <- 'USE_CACHED_PROXY 2'
3/30 10:44:42 [27954] GAHP[27955] -> 'S'
3/30 10:44:42 [27954] No jobs left, shutting down
3/30 10:44:42 [27954] leaving doContactSchedd()
3/30 10:44:42 [27954] Got SIGTERM. Performing graceful shutdown.
3/30 10:44:42 [27954] Started timer to call main_shutdown_fast in 1800
3/30 10:44:42 [27954] **** condor_gridmanager (condor_GRIDMANAGER)
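
The Java stack trace above is interleaved with "Error from GAHP[...]:" marker lines, which makes it hard to read. A minimal sketch for pulling just the trace text out of a log in this format, assuming every Gridmanager line starts with a "M/D HH:MM:SS [pid] " stamp and the GAHP's stderr lines carry no stamp. The function name gahp_error_lines is made up for this example and is not part of Condor.

```python
import re

# Assumed line format, taken from the log above: "3/30 10:39:43 [27954] ..."
STAMP = re.compile(r"^\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2} \[\d+\] ")
# Marker the Gridmanager prints before relaying stderr from the GAHP server.
GAHP_ERROR = re.compile(r"Error from GAHP\[\d+\]:\s*$")


def gahp_error_lines(lines):
    """Return the unstamped lines that follow 'Error from GAHP[...]:' markers."""
    collected = []
    in_error = False
    for raw in lines:
        line = raw.rstrip("\n")
        if STAMP.match(line):
            # A stamped line either opens an error block or closes the last one.
            in_error = bool(GAHP_ERROR.search(line))
        elif in_error and line.strip():
            # Unstamped, non-empty line inside an error block: GAHP stderr.
            collected.append(line.strip())
    return collected
```

Calling gahp_error_lines(open(path)) on a saved copy of the log (path being wherever the Gridmanager log lives on the submit machine) returns the exception message and the "at ..." frames in order. Unstamped lines outside an error block, such as wrapped continuation lines, are skipped.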