
Re: [Condor-users] How to submit a job via SOAP API



Zhifeng,

Please try the "Tutorial: Developer APIs to Condor" from Condor Week 2006

http://www.cs.wisc.edu/condor/CondorWeek2006/presentations.html

Best,


matt

Zhifeng Yu wrote:
I am new to Condor. I have successfully set up a personal Condor
(version 7.0.0) and have submitted and run some simple Java and C
jobs from the command line. I then tried to submit jobs through a
SOAP client written in Java, following the IBM tutorial article.
Condor seems to receive the job but always leaves it "idle".

Here is the Java code I used to submit a job:

files[0] = "/workspace/condor/jobs/submit.java";
WebServicesHelper.submitJobHelper(schedd, "aa0586",
    UniverseType.JAVA, "java", "Simple 4 10", null, files);

submit.java is the submit description file, which works fine with the
command "condor_submit submit.java". Its content is shown below:

Universe = java
Executable = Simple.class
Arguments = Simple 4 10
Log = simple.log
Output = simple.out
Error = simple.error
Queue

Can anyone tell me how I should pass parameters to
WebServicesHelper.submitJobHelper()? I believe this source code is
provided by the Condor group, with a method signature like:

public static void submitJobHelper(CondorScheddPortType schedd,
                                   String owner,
                                   UniverseType type,
                                   String cmd,
                                   String args,
                                   String requirements,
                                   String[] files)
    throws JobSubmissionException, SendFileException,
           java.io.IOException, java.rmi.RemoteException { }
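
One possible reading of the shadow log further down is that the helper
treats cmd as the name of the executable file and expects that file to
be among files, since the shadow fails to stat a file named "java" in
the job's spool directory. Under that assumption only, a minimal sketch
of deriving the arguments from the submit description could look like
the following. The SubmitMapping class and the location of Simple.class
are hypothetical, and the submitJobHelper call is left as a comment
because the Condor classes are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Condor: reads "Key = value" lines
// from a submit description so they can be mapped onto submitJobHelper().
class SubmitMapping {

    // Parses "Key = value" lines; keys are lower-cased, bare commands
    // such as "Queue" (no '=') are skipped.
    static Map<String, String> parse(String description) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String line : description.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String key = line.substring(0, eq).trim().toLowerCase();
            String value = line.substring(eq + 1).trim();
            attrs.put(key, value);
        }
        return attrs;
    }

    public static void main(String[] argv) {
        String description = String.join("\n",
            "Universe = java",
            "Executable = Simple.class",
            "Arguments = Simple 4 10",
            "Log = simple.log",
            "Output = simple.out",
            "Error = simple.error",
            "Queue");

        Map<String, String> attrs = parse(description);
        String cmd = attrs.get("executable");      // "Simple.class", not "java"
        String jobArgs = attrs.get("arguments");   // "Simple 4 10"

        // Assumption: Simple.class sits next to the submit file.
        String[] files = { "/workspace/condor/jobs/" + cmd };

        System.out.println("cmd   = " + cmd);
        System.out.println("args  = " + jobArgs);
        System.out.println("files = " + String.join(", ", files));

        // Assumed call shape, using the signature quoted above:
        // WebServicesHelper.submitJobHelper(schedd, "aa0586",
        //     UniverseType.JAVA, cmd, jobArgs, null, files);
    }
}
```

Whether the helper really stages files this way is not confirmed by
anything in this thread, so treat the mapping as a starting point for
experimentation rather than a known fix.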

I also provided the log file below for analysis.

Thanks and regards,

Zhifeng


-- Submitter: localhost.localdomain :  : localhost.localdomain
 ID    OWNER/NODENAME  SUBMITTED    RUN_TIME   ST PRI SIZE CMD
 9.0   aa0586          3/19 21:58   0+00:00:00 I  0   0.0  java Simple 4 10

1 jobs; 1 idle, 0 running, 0 held

In Negotiator.log, the negotiation seems to be aborted partway
through:

3/19 22:05:33 ---------- Started Negotiation Cycle ----------
3/19 22:05:33 Phase 1:  Obtaining ads from collector ...
3/19 22:05:33 Getting all public ads ...
3/19 22:05:33   Sorting 6 ads ...
3/19 22:05:33   Getting startd private ads ...
3/19 22:05:33 Got ads: 6 public and 2 private
3/19 22:05:33 Public ads include 1 submitter, 2 startd
3/19 22:05:33 Phase 2:  Performing accounting ...
3/19 22:05:33 Phase 3:  Sorting submitter ads by priority ...
3/19 22:05:33 Phase 4.1:  Negotiating with schedds ...
3/19 22:05:33 Negotiating with aa0586@localdomain at
3/19 22:05:33 0 seconds so far
3/19 22:05:33     Request 00009.00000:
3/19 22:05:33       Matched 9.0 aa0586@localdomain preempting none slot1@xxxxxxxxxxxxxxxxxxxxx
3/19 22:05:33 Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
3/19 22:05:33     Got NO_MORE_JOBS;  done negotiating
3/19 22:05:33 ---------- Finished Negotiation Cycle ----------

And starter.log indicates a signal error:

3/19 22:05:33 slot1: match_info called
3/19 22:05:33 slot1: Received match #1205981602#4#...
3/19 22:05:33 slot1: State change: match notification protocol successful
3/19 22:05:33 slot1: Changing state: Unclaimed -> Matched
3/19 22:05:33 slot1: Request accepted.
3/19 22:05:33 slot1: Remote owner is aa0586@localdomain
3/19 22:05:33 slot1: State change: claiming protocol successful
3/19 22:05:33 slot1: Changing state: Matched -> Claimed
3/19 22:05:35 slot1: Got activate_claim request from shadow ()
3/19 22:05:36 slot1: Remote job ID is 9.0
3/19 22:05:36 slot1: Got universe "JAVA" (10) from request classad
3/19 22:05:36 slot1: State change: claim-activation protocol successful
3/19 22:05:36 slot1: Changing activity: Idle -> Busy
3/19 22:05:36 slot1: Called deactivate_claim_forcibly()
3/19 22:05:36 attempt to connect to  failed: Connection refused (connect errno = 111).
3/19 22:05:36 Send_Signal: ERROR sending signal 3 (SIGQUIT) to pid 3517 (still alive)
3/19 22:05:36 slot1: Error sending signal to starter, errno = 25 (Inappropriate ioctl for device)
3/19 22:05:37 Starter pid 3517 exited with status 4
3/19 22:05:37 slot1: State change: starter exited
3/19 22:05:37 slot1: Changing activity: Busy -> Idle
3/19 22:05:37 slot1: State change: received RELEASE_CLAIM command
3/19 22:05:37 slot1: Changing state and activity: Claimed/Idle -> Preempting/Vacating
3/19 22:05:37 slot1: State change: No preempting claim, returning to owner
3/19 22:05:37 slot1: Changing state and activity: Preempting/Vacating -> Owner/Idle
3/19 22:05:37 slot1: State change: IS_OWNER is false
3/19 22:05:37 slot1: Changing state: Owner -> Unclaimed


And shadow file looks like:

3/19 22:05:35 ******************************************************
3/19 22:05:35 ** condor_shadow (CONDOR_SHADOW) STARTING UP
3/19 22:05:35 ** /usr/local/condor/sbin/condor_shadow
3/19 22:05:35 ** $CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
3/19 22:05:35 ** $CondorPlatform: I386-LINUX_RHEL3 $
3/19 22:05:35 ** PID = 3516
3/19 22:05:35 ** Log last touched 3/19 21:55:41
3/19 22:05:35 ******************************************************
3/19 22:05:35 Using config source: /usr/local/condor/etc/condor_config
3/19 22:05:35 Using local config sources:
3/19 22:05:35 /home/aa0586/pool/condor_config.local
3/19 22:05:35 DaemonCore: Command Socket at
3/19 22:05:35 Initializing a JAVA shadow for job 9.0
3/19 22:05:36 (9.0) (3516): Request to run on  was ACCEPTED
3/19 22:05:36 (9.0) (3516): ReliSock::put_file_with_permissions(): Failed to stat file '/home/aa0586/pool/spool/cluster9.proc0.subproc0/java': No such file or directory (errno: 2, si_error: 1)
3/19 22:05:36 (9.0) (3516): DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.0.20 failed to send file(s) to : error reading from /home/aa0586/pool/spool/cluster9.proc0.subproc0/java: (errno 2) No such file or directory; STARTER failed to receive file(s) from
3/19 22:05:36 (9.0) (3516): Job 9.0 going into Hold state (code 13,2): Error from starter on slot1@xxxxxxxxxxxxxxxxxxxxx: STARTER failed to receive file(s) from
3/19 22:05:36 (9.0) (3516): ZKM: setting default map to (null)
3/19 22:05:36 (9.0) (3516): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 112

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/