Re: [Condor-users] How to submit a job via SOAP API



Hi Zhifeng,

You also need to use Condor's internal attribute names rather than the
keywords from the submit description file. In particular, use UserLog,
Out, and Err in place of Log, Output, and Error.
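
As a rough sketch of that renaming (plain Java with no Condor stubs; the
class JobAdNames and its methods are made up for illustration and are not
part of Condor or Birdbath):

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Condor or Birdbath.
final class JobAdNames {
    // Submit-file keyword (lower-cased) -> internal job ad attribute name.
    // Only the three mappings mentioned in this thread are listed.
    private static final Map<String, String> INTERNAL = new LinkedHashMap<>();
    static {
        INTERNAL.put("log", "UserLog");
        INTERNAL.put("output", "Out");
        INTERNAL.put("error", "Err");
    }

    // Returns the internal name, or the trimmed keyword if no mapping is known.
    static String toInternal(String submitKeyword) {
        String key = submitKeyword.trim();
        String mapped = INTERNAL.get(key.toLowerCase(Locale.ROOT));
        return mapped != null ? mapped : key;
    }

    // Translates submit-file settings into name/value pairs for the job ad.
    static Map<String, String> extraAttributes(Map<String, String> submitSettings) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : submitSettings.entrySet()) {
            attrs.put(toInternal(e.getKey()), e.getValue());
        }
        return attrs;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("Log", "simple.log");
        settings.put("Output", "simple.out");
        settings.put("Error", "simple.error");
        // Prints the renamed pairs in insertion order.
        System.out.println(extraAttributes(settings));
    }
}
```

Each resulting name/value pair would then have to be added to the job ad
returned by createJobTemplate before the job is submitted. How that is
done depends on the generated SOAP stub classes, which this sketch
deliberately leaves out.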

I recommend you check out:
http://www.google.com/search?hl=en&q=condor+birdbath+site%3Acommunitygrids.blogspot.com&btnG=Search

In particular http://communitygrids.blogspot.com/2007/12/file-retrieval-with-birdbath-and-condor.html

The author of this blog has made some very interesting notes on using
the Condor SOAP API.

Cheers,

Matias


On Mon, Apr 7, 2008 at 9:54 AM, Matthew Farrellee <matt@xxxxxxxxxxx> wrote:
> You need to add those to the job ad before submission.
>
>
>  Best,
>
>
>  matt
>
>  Zhifeng Yu wrote:
>  > Thanks Matt, I made some progress after reading the references.
>  >
>  > Assuming the submission file is:
>  >
>  > Universe = java
>  > Executable = Simple.class
>  > Arguments = Simple 4 10
>  > Log = simple.log
>  > Output = simple.out
>  > Error = simple.error
>  > Requirements = OpSys=="LINUX"
>  > Queue
>  >
>  > How do you map these parameters to the web service method call? The web service method is:
>  >
>  >    /**
>  >     * Service definition of function condor__createJobTemplate
>  >     */
>  >    public condor.ClassAdStructAndStatus createJobTemplate(int clusterId,
>  >            int jobId, java.lang.String owner, condor.UniverseType type,
>  >            java.lang.String cmd, java.lang.String args,
>  >            java.lang.String requirements) throws java.rmi.RemoteException;
>  >
>  > I figured out that I can call it this way: createJobTemplate(clusterId, jobId, "owner", UniverseType.JAVA, "Simple.class", "4 10", "OpSys==\"LINUX\"").
>  >
>  > But how do I specify "Log = simple.log", "Output = simple.out", and "Error = simple.error" in the method call? Or is there another place to specify these parameters?
>  >
>  > Thanks and regards,
>  >
>  > Zhifeng
>  >
>  >
>  >
>  >
>  >
>  >> Date: Fri, 28 Mar 2008 17:19:53 -0500
>  >> > From: matt@xxxxxxxxxxx
>  >> > To: condor-users@xxxxxxxxxxx
>  >> > Subject: Re: [Condor-users] How to submit a job via SOAP API
>  >> >
>  >> > Zhifeng,
>  >> >
>  >> > Please try the "Tutorial: Developer APIs to Condor" from Condor Week 2006
>  >> >
>  >> > http://www.cs.wisc.edu/condor/CondorWeek2006/presentations.html
>  >> >
>  >> > Best,
>  >> >
>  >> >
>  >> > matt
>  >> >
>  >> > Zhifeng Yu wrote:
>  >> > > I am new to Condor. I have been able to successfully set up a
>  >> > > personal Condor (version 7.0.0), and to submit and run some simple
>  >> > > Java and C programs from the command line. Then I attempted to
>  >> > > submit jobs via a SOAP client written in Java by following the IBM
>  >> > > tutorial article. It seems Condor received the job but always left
>  >> > > it "idle".
>  >> > >
>  >> > > Here is the Java code I used to submit a job:
>  >> > >
>  >> > > files[0] = "/workspace/condor/jobs/submit.java";
>  >> > > WebServicesHelper.submitJobHelper(schedd, "aa0586",
>  >> > >         UniverseType.JAVA, "java", "Simple 4 10", null, files);
>  >> > >
>  >> > > and submit.java is the file which works fine with the command
>  >> > > "condor_submit submit.java". The content of the file is shown
>  >> > > below:
>  >> > >
>  >> > > Universe = java
>  >> > > Executable = Simple.class
>  >> > > Arguments = Simple 4 10
>  >> > > Log = simple.log
>  >> > > Output = simple.out
>  >> > > Error = simple.error
>  >> > > Queue
>  >> > >
>  >> > > Can anyone tell me how I should pass parameters to
>  >> > > WebServicesHelper.submitJobHelper()? I believe this source code is
>  >> > > provided by the Condor group, with a method signature like:
>  >> > >
>  >> > > public static void submitJobHelper(CondorScheddPortType schedd,
>  >> > >         String owner, UniverseType type, String cmd, String args,
>  >> > >         String requirements, String[] files)
>  >> > >         throws JobSubmissionException, SendFileException,
>  >> > >                java.io.IOException, java.rmi.RemoteException { }
>  >> > >
>  >> > > I also provided the log files below for analysis.
>  >> > >
>  >> > > Thanks and regards,
>  >> > >
>  >> > > Zhifeng
>  >> > >
>  >> > >
>  >> > > -- Submitter: localhost.localdomain : : localhost.localdomain
>  >> > > ID OWNER/NODENAME SUBMITTED RUN_TIME ST PRI SIZE CMD
>  >> > > 9.0 aa0586 3/19 21:58 0+00:00:00 I 0 0.0 java Simple 4 10
>  >> > > 1 jobs; 1 idle, 0 running, 0 held
>  >> > >
>  >> > > Negotiator.log: it seems that negotiation is aborted in the middle:
>  >> > >
>  >> > > 3/19 22:05:33 ---------- Started Negotiation Cycle ----------
>  >> > > 3/19 22:05:33 Phase 1: Obtaining ads from collector ...
>  >> > > 3/19 22:05:33 Getting all public ads ...
>  >> > > 3/19 22:05:33 Sorting 6 ads ...
>  >> > > 3/19 22:05:33 Getting startd private ads ...
>  >> > > 3/19 22:05:33 Got ads: 6 public and 2 private
>  >> > > 3/19 22:05:33 Public ads include 1 submitter, 2 startd
>  >> > > 3/19 22:05:33 Phase 2: Performing accounting ...
>  >> > > 3/19 22:05:33 Phase 3: Sorting submitter ads by priority ...
>  >> > > 3/19 22:05:33 Phase 4.1: Negotiating with schedds ...
>  >> > > 3/19 22:05:33 Negotiating with aa0586@localdomain at
>  >> > > 3/19 22:05:33 0 seconds so far
>  >> > > 3/19 22:05:33 Request 00009.00000:
>  >> > > 3/19 22:05:33 Matched 9.0 aa0586@localdomain preempting none slot1@xxxxxxxxxxxxxxxxxxxxx
>  >> > > 3/19 22:05:33 Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
>  >> > > 3/19 22:05:33 Got NO_MORE_JOBS; done negotiating
>  >> > > 3/19 22:05:33 ---------- Finished Negotiation Cycle ----------
>  >> > >
>  >> > > And starter.log indicates a signal error:
>  >> > >
>  >> > > 3/19 22:05:33 slot1: match_info called
>  >> > > 3/19 22:05:33 slot1: Received match #1205981602#4#...
>  >> > > 3/19 22:05:33 slot1: State change: match notification protocol successful
>  >> > > 3/19 22:05:33 slot1: Changing state: Unclaimed -> Matched
>  >> > > 3/19 22:05:33 slot1: Request accepted.
>  >> > > 3/19 22:05:33 slot1: Remote owner is aa0586@localdomain
>  >> > > 3/19 22:05:33 slot1: State change: claiming protocol successful
>  >> > > 3/19 22:05:33 slot1: Changing state: Matched -> Claimed
>  >> > > 3/19 22:05:35 slot1: Got activate_claim request from shadow ()
>  >> > > 3/19 22:05:36 slot1: Remote job ID is 9.0
>  >> > > 3/19 22:05:36 slot1: Got universe "JAVA" (10) from request classad
>  >> > > 3/19 22:05:36 slot1: State change: claim-activation protocol successful
>  >> > > 3/19 22:05:36 slot1: Changing activity: Idle -> Busy
>  >> > > 3/19 22:05:36 slot1: Called deactivate_claim_forcibly()
>  >> > > 3/19 22:05:36 attempt to connect to failed: Connection refused (connect errno = 111).
>  >> > > 3/19 22:05:36 Send_Signal: ERROR sending signal 3 (SIGQUIT) to pid 3517 (still alive)
>  >> > > 3/19 22:05:36 slot1: Error sending signal to starter, errno = 25 (Inappropriate ioctl for device)
>  >> > > 3/19 22:05:37 Starter pid 3517 exited with status 4
>  >> > > 3/19 22:05:37 slot1: State change: starter exited
>  >> > > 3/19 22:05:37 slot1: Changing activity: Busy -> Idle
>  >> > > 3/19 22:05:37 slot1: State change: received RELEASE_CLAIM command
>  >> > > 3/19 22:05:37 slot1: Changing state and activity: Claimed/Idle -> Preempting/Vacating
>  >> > > 3/19 22:05:37 slot1: State change: No preempting claim, returning to owner
>  >> > > 3/19 22:05:37 slot1: Changing state and activity: Preempting/Vacating -> Owner/Idle
>  >> > > 3/19 22:05:37 slot1: State change: IS_OWNER is false
>  >> > > 3/19 22:05:37 slot1: Changing state: Owner -> Unclaimed
>  >> > >
>  >> > >
>  >> > > And the shadow log looks like:
>  >> > >
>  >> > > 3/19 22:05:35 ******************************************************
>  >> > > 3/19 22:05:35 ** condor_shadow (CONDOR_SHADOW) STARTING UP
>  >> > > 3/19 22:05:35 ** /usr/local/condor/sbin/condor_shadow
>  >> > > 3/19 22:05:35 ** $CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
>  >> > > 3/19 22:05:35 ** $CondorPlatform: I386-LINUX_RHEL3 $
>  >> > > 3/19 22:05:35 ** PID = 3516
>  >> > > 3/19 22:05:35 ** Log last touched 3/19 21:55:41
>  >> > > 3/19 22:05:35 ******************************************************
>  >> > > 3/19 22:05:35 Using config source: /usr/local/condor/etc/condor_config
>  >> > > 3/19 22:05:35 Using local config sources:
>  >> > > 3/19 22:05:35 /home/aa0586/pool/condor_config.local
>  >> > > 3/19 22:05:35 DaemonCore: Command Socket at
>  >> > > 3/19 22:05:35 Initializing a JAVA shadow for job 9.0
>  >> > > 3/19 22:05:36 (9.0) (3516): Request to run on was ACCEPTED
>  >> > > 3/19 22:05:36 (9.0) (3516): ReliSock::put_file_with_permissions(): Failed to stat file '/home/aa0586/pool/spool/cluster9.proc0.subproc0/java': No such file or directory (errno: 2, si_error: 1)
>  >> > > 3/19 22:05:36 (9.0) (3516): DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.0.20 failed to send file(s) to : error reading from /home/aa0586/pool/spool/cluster9.proc0.subproc0/java: (errno 2) No such file or directory; STARTER failed to receive file(s) from
>  >> > > 3/19 22:05:36 (9.0) (3516): Job 9.0 going into Hold state (code 13,2): Error from starter on slot1@xxxxxxxxxxxxxxxxxxxxx: STARTER failed to receive file(s) from
>  >> > > 3/19 22:05:36 (9.0) (3516): ZKM: setting default map to (null)
>  >> > > 3/19 22:05:36 (9.0) (3516): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 112
>  >> > >
>  >> > > _______________________________________________
>  >> > > Condor-users mailing list
>  >> > > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
>  >> > > with a subject: Unsubscribe
>  >> > > You can also unsubscribe by visiting
>  >> > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>  >> > >
>  >> > > The archives can be found at:
>  >> > > https://lists.cs.wisc.edu/archive/condor-users/