
[Condor-users] Condor-WS job submit issue: "UNKNOWNJOB"



Hi,

I am trying to test the Java code posted by Jeff from IBM ("Manage grid
resources with Condor Web service"). All the functions work perfectly
except for job submission, where the server responds with "UNKNOWNJOB".

My configuration:
Server:
     condor-6.8.6
     JDK1.6.0_03
     JRE1.6.0_03
Client:
     JDK1.5

While debugging, I stopped at the following statement:
      RequirementsAndStatus reqs_s = schedd.submit(transaction, clusterId, jobId, jobAd);
At that point jobAd appears to hold a complete record of all the
properties. I just cannot monitor what the server receives :-(

Srinivasan may have had a similar problem in February, but I could not
find out how he solved it.
Please help.

cheers,
David

--------condor_config-------
RELEASE_DIR             = /usr/local/condor-6.8.6
WEB_ROOT_DIR            = /usr/local/condor-6.8.6/web
ENABLE_SOAP             = TRUE
ENABLE_WEB_SERVER       = TRUE
ALLOW_SOAP              = */*
HOSTALLOW_READ = *
HOSTALLOW_WRITE = *

----------condor_q result------
-- Submitter: UGDExxx.xxx.xxx.CA : <142.104.61.85:9697> : UGDEV01.phys.UVic.CA
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 --- ???? ---
 --- ???? ---
  94.0   daobgong       11/9  10:44   0+00:00:19 R  0   9.8  test.sh
** The job submitted through condor_submit shows a normal status.
** Jobs submitted through the Condor-WS Java code show an unknown status
   (the "--- ???? ---" rows).
** Unknown jobs can still be deleted by clusterID and jobID if I know the IDs.


 -------------job_queue.log
***** the unknown-status job always has very few entries
103 0.0 NextClusterNum 93
101 092.-1 Job Machine
101 92.0 Job Machine
103 92.0 GlobalJobId "UGDxxx.xxx.xxx.CA#1194629483#92.0"
106
105
******the normal one has many more entries
103 0.0 NextClusterNum 94
101 093.-1 Job Machine
101 93.0 Job Machine
103 93.0 GlobalJobId "UGDExxx.xxxx.xxx.CA#1194631899#93.0"
103 093.-1 ClusterId 93
103 093.-1 QDate 1194631899
103 093.-1 CompletionDate 0
103 093.-1 User "daobgong@xxxxxxxxxxxx ugdev01"
103 093.-1 Owner "daobgong"
103 093.-1 RemoteWallClockTime 0.000000
103 093.-1 LocalUserCpu 0.000000
103 093.-1 LocalSysCpu 0.000000
103 093.-1 RemoteUserCpu 0.000000
103 093.-1 RemoteSysCpu 0.000000
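
The comparison above can be automated. The following is only a minimal
sketch: it assumes, from the excerpt, that a "101" record creates an ad,
that a "103" record sets one attribute on it, and that the "0.0" key is
the queue header rather than a job. The class and method names
(JobQueueLogCheck, attributesByCluster, clustersMissing) are made up for
illustration and are not part of Condor. It groups cluster-level
("093.-1") and job-level ("93.0") ads under one cluster number and
reports the clusters that never receive a given attribute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class JobQueueLogCheck {

    // Assumption read off the excerpt: "101" creates an ad and "103"
    // sets one attribute on it. All other record types are ignored.
    static final String NEW_AD = "101";
    static final String SET_ATTRIBUTE = "103";

    /** Maps each cluster number to the attribute names set on any of its ads. */
    static Map<Integer, Set<String>> attributesByCluster(List<String> lines) {
        Map<Integer, Set<String>> byCluster = new TreeMap<Integer, Set<String>>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+", 4);
            if (fields.length < 2) {
                continue; // bare records such as "105" and "106"
            }
            String op = fields[0];
            if (!op.equals(NEW_AD) && !op.equals(SET_ATTRIBUTE)) {
                continue; // comments and record types we do not interpret
            }
            int dot = fields[1].indexOf('.');
            if (dot <= 0) {
                continue; // key is not of the form cluster.proc
            }
            int cluster;
            try {
                cluster = Integer.parseInt(fields[1].substring(0, dot));
            } catch (NumberFormatException e) {
                continue;
            }
            if (cluster == 0) {
                continue; // assumed to be the queue header ("0.0"), not a job
            }
            Set<String> names = byCluster.get(cluster);
            if (names == null) {
                names = new TreeSet<String>();
                byCluster.put(cluster, names);
            }
            if (op.equals(SET_ATTRIBUTE) && fields.length >= 3) {
                names.add(fields[2]);
            }
        }
        return byCluster;
    }

    /** Returns, in ascending order, the clusters that never get the attribute. */
    static List<Integer> clustersMissing(List<String> lines, String attribute) {
        List<Integer> missing = new ArrayList<Integer>();
        for (Map.Entry<Integer, Set<String>> e : attributesByCluster(lines).entrySet()) {
            if (!e.getValue().contains(attribute)) {
                missing.add(e.getKey());
            }
        }
        return missing;
    }
}
```

Fed the two excerpts above, clustersMissing(lines, "Owner") would report
only cluster 92, since the Condor-WS job never gets an Owner entry.
Checking for "JobStatus" against the full job_queue.log should point at
the same jobs the scheduler log complains about.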

----------Scheduler Log
11/7 09:22:08 (pid:7536) About to serve HTTP request...
11/7 09:22:08 (pid:7536) Completed servicing HTTP request
11/7 09:22:08 (pid:7536) Received HTTP POST connection from <142.xx.xx.xx:44580>
11/7 09:22:08 (pid:7536) About to serve HTTP request...
11/7 09:22:08 (pid:7536) Completed servicing HTTP request
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
                         ********************************************

11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:22:21 (pid:7536) Sent ad to central manager for daobgong@localdomain localhost
11/7 09:22:21 (pid:7536) Sent ad to 1 collectors for daobgong@localdomain localhost
11/7 09:23:30 (pid:7536) Received HTTP POST connection from <142.xx.xx.xx:40826>
11/7 09:23:30 (pid:7536) About to serve HTTP request...
11/7 09:23:30 (pid:7536) Timer 1095 not found
                         **********************

11/7 09:23:30 (pid:7536) Completed servicing HTTP request
11/7 09:27:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:27:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:27:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:27:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...
11/7 09:27:21 (pid:7536) Job has no JobStatus attribute.  Ignoring...