
Re: [Condor-users] How to submit a job via SOAP API




Thanks to Matias and Matt.
 
It is more or less working now, after I learned how to add the attributes from the "communitygrids" blog. However, I have run into another issue: the job is not executed right away, even though I am sure I commit the transaction at the end. The job sits in the "idle" state for a long time (about 20 minutes) before it starts running. If I run "condor_reschedule", the job starts right away. The same job works fine when submitted with "condor_submit". In the web service call I do specify transferring the executable Java class file, even though the job runs on the same machine. The file is very small, though; it is the sample "Simple.class".
 
Anything else I should do? I use Condor 7.0.1.
 
BTW, it seems a lot of the Condor SOAP tutorials, presentations, and BirdBath.jar are based on older versions of Condor (6.7.x). Later versions changed the WSDL a little, so some examples and utility classes for SOAP are no longer valid.
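
In case it helps others who find this thread, here is a minimal sketch of the renaming step Matias described: the submit-file keywords Log, Output, and Error become the job ad attributes UserLog, Out, and Err. The class name JobAdNames and the method toJobAdAttributes are hypothetical names I made up for illustration; they are not part of Condor or BirdBath. The sketch only builds name/value pairs and does not make any SOAP call, so adding the pairs to the job ad returned by createJobTemplate still has to be done with whatever classes your generated stubs provide.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Condor or BirdBath.
class JobAdNames {
    // Submit-file keyword (lower-cased) -> job ad attribute name,
    // using the three names given earlier in this thread.
    private static final Map<String, String> INTERNAL = new LinkedHashMap<>();
    static {
        INTERNAL.put("log", "UserLog");
        INTERNAL.put("output", "Out");
        INTERNAL.put("error", "Err");
    }

    // Returns only the entries that need renaming, keyed by the job ad name.
    // Other keywords (Universe, Executable, ...) are skipped because they are
    // already covered by the createJobTemplate parameters.
    static Map<String, String> toJobAdAttributes(Map<String, String> submitEntries) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : submitEntries.entrySet()) {
            String internal = INTERNAL.get(e.getKey().toLowerCase(Locale.ROOT));
            if (internal != null) {
                attrs.put(internal, e.getValue());
            }
        }
        return attrs;
    }

    public static void main(String[] args) {
        // Values taken from the submit file quoted in this thread.
        Map<String, String> submit = new LinkedHashMap<>();
        submit.put("Universe", "java");
        submit.put("Log", "simple.log");
        submit.put("Output", "simple.out");
        submit.put("Error", "simple.error");

        for (Map.Entry<String, String> e : toJobAdAttributes(submit).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Each resulting pair would then be added to the job ad before submission, as Matt suggested.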
 
Thanks and regards,
 
Zhifeng





> Date: Mon, 7 Apr 2008 12:30:18 -0300
> From: mattlistas@xxxxxxxxx
> To: condor-users@xxxxxxxxxxx
> Subject: Re: [Condor-users] How to submit a job via SOAP API
>
> Hi Zhifeng,
>
> You also need to pass in Condor's internal attribute names instead of the
> names used in the submit description file. In particular, you need to use
> UserLog, Out, and Err for the ones you wanted.
>
> I recommend you check out:
> http://www.google.com/search?hl=en&q=condor+birdbath+site%3Acommunitygrids.blogspot.com&btnG=Search
>
> In particular http://communitygrids.blogspot.com/2007/12/file-retrieval-with-birdbath-and-condor.html
>
> The author of this blog has made some very interesting notes on using
> the Condor SOAP API.
>
> Cheers,
>
> Matias
>
>
> On Mon, Apr 7, 2008 at 9:54 AM, Matthew Farrellee <matt@xxxxxxxxxxx> wrote:
> > You need to add those to the job ad before submission.
> >
> >
> > Best,
> >
> >
> > matt
> >
> > Zhifeng Yu wrote:
> > > Thanks Matt, I made some progress after reading the references.
> > >
> > > Assuming the submission file is:
> > >
> > > Universe = java
> > > Executable = Simple.class
> > > Arguments = Simple 4 10
> > > Log = simple.log
> > > Output = simple.out
> > > Error = simple.error
> > > Requirements = OpSys == "LINUX"
> > > Queue
> > >
> > > How do you map these parameters to the web service method call? The web service method is:
> > >
> > > /**
> > >  * Service definition of function condor__createJobTemplate
> > >  */
> > > public condor.ClassAdStructAndStatus createJobTemplate(
> > >         int clusterId, int jobId, java.lang.String owner,
> > >         condor.UniverseType type, java.lang.String cmd,
> > >         java.lang.String args, java.lang.String requirements)
> > >         throws java.rmi.RemoteException;
> > >
> > > I figured out that I can call it this way:
> > >
> > > createJobTemplate(clusterId, jobId, "owner", UniverseType.JAVA, "Simple.class", "4 10", "OpSys==\"LINUX\"")
> > >
> > > But how do I define "Log = simple.log", "Output = simple.out", and "Error = simple.error" in the method call? Or is there another place to specify these parameters?
> > >
> > > Thanks and regards,
> > >
> > > Zhifeng
> > >
> > >
> > >
> > >
> > >
> > > > Date: Fri, 28 Mar 2008 17:19:53 -0500
> > > > From: matt@xxxxxxxxxxx
> > > > To: condor-users@xxxxxxxxxxx
> > > > Subject: Re: [Condor-users] How to submit a job via SOAP API
> > > >
> > > > Zhifeng,
> > > >
> > > > Please try the "Tutorial: Developer APIs to Condor" from Condor Week 2006
> > > >
> > > > http://www.cs.wisc.edu/condor/CondorWeek2006/presentations.html
> > > >
> > > > Best,
> > > >
> > > > matt
> > > >
> > > > Zhifeng Yu wrote:
> > > > > I am new to Condor. I have been able to successfully set up a personal Condor (version 7.0.0) and to submit and run some simple Java and C jobs via the command line. Then I attempted to submit jobs via a SOAP client written in Java by following the IBM tutorial article. It seems Condor received the job but always leaves it in the "idle" state.
> > > > >
> > > > > Here is the Java code I used to submit a job:
> > > > >
> > > > > files[0] = "/workspace/condor/jobs/submit.java";
> > > > > WebServicesHelper.submitJobHelper(schedd, "aa0586", UniverseType.JAVA, "java", "Simple 4 10", null, files);
> > > > >
> > > > > and submit.java is the file which works fine with the command "condor_submit submit.java". The content of the file is shown below:
> > > > >
> > > > > Universe = java
> > > > > Executable = Simple.class
> > > > > Arguments = Simple 4 10
> > > > > Log = simple.log
> > > > > Output = simple.out
> > > > > Error = simple.error
> > > > > Queue
> > > > >
> > > > > Can anyone tell me how I should pass parameters to WebServicesHelper.submitJobHelper()? I believe this source code is provided by the Condor group, with a method signature like:
> > > > >
> > > > > public static void submitJobHelper(CondorScheddPortType schedd,
> > > > >         String owner, UniverseType type, String cmd, String args,
> > > > >         String requirements, String[] files)
> > > > >         throws JobSubmissionException, SendFileException,
> > > > >         java.io.IOException, java.rmi.RemoteException { }
> > > > >
> > > > > I also provided the log files below for analysis.
> > > > >
> > > > > Thanks and regards,
> > > > >
> > > > > Zhifeng
> > > > >
> > > > > -- Submitter: localhost.localdomain : : localhost.localdomain
> > > > > ID OWNER/NODENAME SUBMITTED RUN_TIME ST PRI SIZE CMD
> > > > > 9.0 aa0586 3/19 21:58 0+00:00:00 I 0 0.0 java Simple 4 10
> > > > > 1 jobs; 1 idle, 0 running, 0 held
> > > > >
> > > > > Negotiator.log, it seems that negotiation is aborted in the middle as,
> > > > >
> > > > > 3/19 22:05:33 ---------- Started Negotiation Cycle ----------
> > > > > 3/19 22:05:33 Phase 1: Obtaining ads from collector ...
> > > > > 3/19 22:05:33 Getting all public ads ...
> > > > > 3/19 22:05:33 Sorting 6 ads ...
> > > > > 3/19 22:05:33 Getting startd private ads ...
> > > > > 3/19 22:05:33 Got ads: 6 public and 2 private
> > > > > 3/19 22:05:33 Public ads include 1 submitter, 2 startd
> > > > > 3/19 22:05:33 Phase 2: Performing accounting ...
> > > > > 3/19 22:05:33 Phase 3: Sorting submitter ads by priority ...
> > > > > 3/19 22:05:33 Phase 4.1: Negotiating with schedds ...
> > > > > 3/19 22:05:33 Negotiating with aa0586@localdomain at
> > > > > 3/19 22:05:33 0 seconds so far
> > > > > 3/19 22:05:33 Request 00009.00000:
> > > > > 3/19 22:05:33 Matched 9.0 aa0586@localdomain preempting none slot1@xxxxxxxxxxxxxxxxxxxxx
> > > > > 3/19 22:05:33 Successfully matched with slot1@xxxxxxxxxxxxxxxxxxxxx
> > > > > 3/19 22:05:33 Got NO_MORE_JOBS; done negotiating
> > > > > 3/19 22:05:33 ---------- Finished Negotiation Cycle ----------
> > > > >
> > > > > And starter.log indicates signal error:
> > > > >
> > > > > 3/19 22:05:33 slot1: match_info called
> > > > > 3/19 22:05:33 slot1: Received match #1205981602#4#...
> > > > > 3/19 22:05:33 slot1: State change: match notification protocol successful
> > > > > 3/19 22:05:33 slot1: Changing state: Unclaimed -> Matched
> > > > > 3/19 22:05:33 slot1: Request accepted.
> > > > > 3/19 22:05:33 slot1: Remote owner is aa0586@localdomain
> > > > > 3/19 22:05:33 slot1: State change: claiming protocol successful
> > > > > 3/19 22:05:33 slot1: Changing state: Matched -> Claimed
> > > > > 3/19 22:05:35 slot1: Got activate_claim request from shadow ()
> > > > > 3/19 22:05:36 slot1: Remote job ID is 9.0
> > > > > 3/19 22:05:36 slot1: Got universe "JAVA" (10) from request classad
> > > > > 3/19 22:05:36 slot1: State change: claim-activation protocol successful
> > > > > 3/19 22:05:36 slot1: Changing activity: Idle -> Busy
> > > > > 3/19 22:05:36 slot1: Called deactivate_claim_forcibly()
> > > > > 3/19 22:05:36 attempt to connect to failed: Connection refused (connect errno = 111).
> > > > > 3/19 22:05:36 Send_Signal: ERROR sending signal 3 (SIGQUIT) to pid 3517 (still alive)
> > > > > 3/19 22:05:36 slot1: Error sending signal to starter, errno = 25 (Inappropriate ioctl for device)
> > > > > 3/19 22:05:37 Starter pid 3517 exited with status 4
> > > > > 3/19 22:05:37 slot1: State change: starter exited
> > > > > 3/19 22:05:37 slot1: Changing activity: Busy -> Idle
> > > > > 3/19 22:05:37 slot1: State change: received RELEASE_CLAIM command
> > > > > 3/19 22:05:37 slot1: Changing state and activity: Claimed/Idle -> Preempting/Vacating
> > > > > 3/19 22:05:37 slot1: State change: No preempting claim, returning to owner
> > > > > 3/19 22:05:37 slot1: Changing state and activity: Preempting/Vacating -> Owner/Idle
> > > > > 3/19 22:05:37 slot1: State change: IS_OWNER is false
> > > > > 3/19 22:05:37 slot1: Changing state: Owner -> Unclaimed
> > > > >
> > > > > And shadow file looks like:
> > > > >
> > > > > 3/19 22:05:35 ******************************************************
> > > > > 3/19 22:05:35 ** condor_shadow (CONDOR_SHADOW) STARTING UP
> > > > > 3/19 22:05:35 ** /usr/local/condor/sbin/condor_shadow
> > > > > 3/19 22:05:35 ** $CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
> > > > > 3/19 22:05:35 ** $CondorPlatform: I386-LINUX_RHEL3 $
> > > > > 3/19 22:05:35 ** PID = 3516
> > > > > 3/19 22:05:35 ** Log last touched 3/19 21:55:41
> > > > > 3/19 22:05:35 ******************************************************
> > > > > 3/19 22:05:35 Using config source: /usr/local/condor/etc/condor_config
> > > > > 3/19 22:05:35 Using local config sources:
> > > > > 3/19 22:05:35 /home/aa0586/pool/condor_config.local
> > > > > 3/19 22:05:35 DaemonCore: Command Socket at
> > > > > 3/19 22:05:35 Initializing a JAVA shadow for job 9.0
> > > > > 3/19 22:05:36 (9.0) (3516): Request to run on was ACCEPTED
> > > > > 3/19 22:05:36 (9.0) (3516): ReliSock::put_file_with_permissions(): Failed to stat file '/home/aa0586/pool/spool/cluster9.proc0.subproc0/java': No such file or directory (errno: 2, si_error: 1)
> > > > > 3/19 22:05:36 (9.0) (3516): DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.0.20 failed to send file(s) to : error reading from /home/aa0586/pool/spool/cluster9.proc0.subproc0/java: (errno 2) No such file or directory; STARTER failed to receive file(s) from
> > > > > 3/19 22:05:36 (9.0) (3516): Job 9.0 going into Hold state (code 13,2): Error from starter on slot1@xxxxxxxxxxxxxxxxxxxxx: STARTER failed to receive file(s) from
> > > > > 3/19 22:05:36 (9.0) (3516): ZKM: setting default map to (null)
> > > > > 3/19 22:05:36 (9.0) (3516): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 112
> > > > >
> > > > > _______________________________________________
> > > > > Condor-users mailing list
> > > > > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > > > > subject: Unsubscribe
> > > > > You can also unsubscribe by visiting
> > > > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > > > >
> > > > > The archives can be found at:
> > > > > https://lists.cs.wisc.edu/archive/condor-users/