
[Condor-users] Problem when submitting to condor-g using condor 6.8.4 and gt4



 

Hello everyone,

We have recently begun installing grid tools on our laboratory clusters and we are having problems making Condor-G work with GT 4.0.

After installing Condor and GT 4.0, we have managed to submit and complete simple jobs with Condor and with Globus separately, and also with Globus using Condor to manage our cluster resources. The problem arises when we try to submit a simple job through Condor-G.

Take, for instance, the example included with Condor for checking the environment variables (env.cmd). We modified the submit file as follows:

####################
##
## Test Condor command file
##
####################
executable = env.remote
Universe = grid
grid_resource = gt4 192.168.0.252:8443/wsrf/services/ManagedJobFactoryService Condor
output = env.out
error = env.err
log = env.log
Args = "foo bar glarch"
environment = alpha=a;bravo=b;charlie=c
queue

 

However, the job remains idle indefinitely. Part of the GridManager log output follows:

5/17 16:16:08 Welcome to the all-singing, all dancing, "amazing" GridManager!
5/17 16:16:08 [31892] Getting monitoring info for pid 31892
5/17 16:16:08 [31892] Checking proxies
5/17 16:16:09 [31892] DaemonCore: in SendAliveToParent()
5/17 16:16:09 [31892] DaemonCore: attempting to connect to '<192.168.0.252:41401>'
5/17 16:16:11 [31892] Received ADD_JOBS signal
5/17 16:16:11 [31892] in doContactSchedd()
5/17 16:16:11 [31892] querying for new jobs
5/17 16:16:11 [31892] Using constraint ((Owner=?="Panagiotis"&&JobUniverse==9)) && (Managed =!= "ScheddDone") && (((Matched =!= FALSE) && (JobStatus != 5)) || (Managed =?= "External"))
5/17 16:16:11 [31892] Using job type GT4 for job 99.0
5/17 16:16:11 [31892] (99.0) SetJobLeaseTimers()
5/17 16:16:11 [31892] Found job 99.0 --- inserting
5/17 16:16:11 [31892] Fetched 1 new job ads from schedd
5/17 16:16:11 [31892] querying for removed/held jobs
5/17 16:16:11 [31892] Using constraint ((Owner=?="Panagiotis"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
5/17 16:16:11 [31892] Fetched 0 job ads from schedd
5/17 16:16:11 [31892] leaving doContactSchedd()
5/17 16:16:11 [31892] gahp server not up yet, delaying ping
5/17 16:16:11 [31892] *** UpdateLeases called
5/17 16:16:11 [31892] Leases not supported, cancelling timer
5/17 16:16:11 [31892] *** checkDelegation()
5/17 16:16:11 [31892] gahp server not up yet, delaying checkDelegation
5/17 16:16:11 [31892] (99.0) doEvaluateState called: gmState GM_INIT, globusState 32
5/17 16:16:11 [31892] GAHP server pid = 31893
5/17 16:16:11 [31892] Failed to read GAHP server version
5/17 16:16:11 [31892] (99.0) Error initializing GAHP
5/17 16:16:11 [31892] (99.0) gm state change: GM_INIT -> GM_HOLD
5/17 16:16:11 [31892] (99.0) Writing hold record to user logfile
5/17 16:16:11 [31892] (99.0) gm state change: GM_HOLD -> GM_DELETE
5/17 16:16:11 [31892] DaemonCore: No more children processes to reap.
5/17 16:16:11 [31892] ERROR "Gahp Server (pid=31893) exited with status 255 " at line 278 in file gahp-client.C
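
In case it helps anyone reading a similar log, here is a minimal Python sketch for pulling the per-job state changes and the GAHP-related errors out of it. It assumes log lines shaped like the excerpt above. The function name summarize and the marker strings are illustrative choices taken from this excerpt, not part of Condor, and other Condor versions may word these messages differently.

```python
import re

# Matches lines such as:
#   5/17 16:16:11 [31892] (99.0) gm state change: GM_INIT -> GM_HOLD
STATE_CHANGE = re.compile(r"\((\d+\.\d+)\) gm state change: (\w+) -> (\w+)")

# Lowercased substrings that mark a GAHP failure in the excerpt above
# (an assumption based on this one log, not an exhaustive list).
GAHP_MARKERS = (
    "failed to read gahp",
    "error initializing gahp",
    "gahp server (pid",
)


def summarize(lines):
    """Return (state_changes, gahp_errors) found in an iterable of log lines."""
    changes, errors = [], []
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            # (job id, old state, new state)
            changes.append(match.groups())
        elif any(marker in line.lower() for marker in GAHP_MARKERS):
            errors.append(line.strip())
    return changes, errors


# A few lines copied from the log excerpt above.
sample = [
    "5/17 16:16:11 [31892] GAHP server pid = 31893",
    "5/17 16:16:11 [31892] Failed to read GAHP server version",
    "5/17 16:16:11 [31892] (99.0) Error initializing GAHP",
    "5/17 16:16:11 [31892] (99.0) gm state change: GM_INIT -> GM_HOLD",
    "5/17 16:16:11 [31892] (99.0) gm state change: GM_HOLD -> GM_DELETE",
]

changes, errors = summarize(sample)
for job, old, new in changes:
    print(job, old, "->", new)
for err in errors:
    print("GAHP problem:", err)
```

Run on the sample lines, this reports the GM_INIT -> GM_HOLD and GM_HOLD -> GM_DELETE transitions for job 99.0 together with the two GAHP failure messages, which is the same sequence visible in the log: the GAHP server starts, its version cannot be read, and the job is put on hold.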

We have checked the Java configuration, as two other similar posts suggested, but as far as we can see everything is in order. Any ideas?

 

Thank you in advance,

 

Giannis Kampolis