Re: [Condor-users] DAG submission using Condor SOAP webservice



unsubscribe


From: "Belai Beshah" belai.beshah@xxxxxxxxx
Sent: Mon, 18 Jun 2012 06:50:59 +0530
To: Matthew Farrellee matt@xxxxxxxxxx, Condor-Users Mail List condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] DAG submission using Condor SOAP webservice
Thanks, Matt, for the suggestion. We tried it using a simple ping job, and the results are as follows:

>

> 1. With the out-of-the-box Windows install and the dynamic "condor-reuse-slot1" user, Job A fails when it tries to submit Job B, with the same error as in the SOAP case ("ERROR: No credential stored for condor-reuse-slot1@..."). It looks like the problem is that the submitter machine is not able to use the dynamic credentials of "condor-reuse-slot1".

>

> 2. If we change the submitter and node to run as a specific domain user by adding "SLOT1_user = <domain user>" to "condor_config.local", and make sure that this domain user has its credentials stored using "condor_store_cred add" on both the submit and runner nodes, we get a different error: "Failed to initialize user log to C:\condor\execute\dir_3680\ping-A-141.log". We don't know why it can't create the log, since we have verified that the domain user has full read/write permission on the Condor execution directory.

>

> We got the same error as above with or without "-remote". As always, submitting Job B by hand on the submitter machine works fine in both configurations. We have been trying different ideas for the problem of one job submitting another and are wondering whether it is specific to the clipped Windows version or is an unsupported use case in general. Any solution or suggestion to try is highly appreciated, since we have exhausted all the options we can think of.

>

> Thanks

> Belai

>

>

> -----Original Message-----

> From: Matthew Farrellee [mailto:matt@xxxxxxxxxx]

> Sent: Sunday, June 17, 2012 7:40 AM

> To: Condor-Users Mail List

> Cc: Belai Beshah

> Subject: Re: [Condor-users] DAG submission using Condor SOAP webservice

>

> If you condor_submit Job A (no SOAP), can Job A submit Job B?

>

> Try with and without -remote.

>

> Best,

>

>

> matt

>

> On 06/15/2012 06:24 PM, Belai Beshah wrote:

> > It looks like the problem is not the DAG; it affects any kind of job submission from another job. Submitting Job A using the SOAP web service, having it run on a submitter machine, and having it try to submit another Job B fails with "ERROR: No credential stored for condor-reuse-slot1@...".

> > Is there something fundamental we are missing, or is a Condor job submitting another job not supported on Windows?

> >

> > -----Original Message-----

> > From: Belai Beshah

> > Sent: Thursday, June 14, 2012 1:56 PM

> > To: 'nwp@xxxxxxxxxxx'; Condor-Users Mail List

> > Subject: RE: [Condor-users] DAG submission using Condor SOAP

> > webservice

> >

> > The problem is with running the DAG automatically on the submit machine. SOAP is working as expected: it submits and starts Job A correctly.

> >

> > I would like to try your suggestion, but there are a couple of hurdles: we have not yet built a means of submitting a DAG using SOAP, and the modification to the application that submits the DAG will require changes by another group. I am still not sure it will work, though, since it will still require being able to submit a DAG job from the submitter machine without a human being logged in to that machine.

> >

> > -----Original Message-----

> > From: condor-users-bounces@xxxxxxxxxxx

> > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Nathan Panike

> > Sent: Thursday, June 14, 2012 9:55 AM

> > To: Condor-Users Mail List

> > Subject: Re: [Condor-users] DAG submission using Condor SOAP

> > webservice

> >

> > On Thu, Jun 14, 2012 at 08:39:53AM -0400, Belai Beshah wrote:

> >> Thanks, Todd, for the suggestion, but looking at the two pointers
> >> carefully, our problem seems different. We submit, let's say, Job A
> >> using SOAP, and when Job A runs it calculates the dependencies
> >> required based on the input dataset and tries to submit a DAG, let's
> >> say Job B. From the outside Java client that uses the SOAP API we
> >> don't know what the DAG dependencies are and cannot directly generate
> >> the DAG for Job B, so we have to submit the vanilla universe Job A to
> >> figure the dependencies out and submit a DAG job, not through the
> >> SOAP API but by calling "condor_submit_dag" directly when it runs.
> >> The problem we are facing is that a vanilla universe job running
> >> successfully on a submitter machine is not able to submit a DAG. We
> >> can see that the DAG files get created correctly; it is only the
> >> "condor_submit_dag" call that fails. If we log in to the submit
> >> machine, we can manually submit the DAG file created for Job B
> >> successfully.
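The failing step described above (Job A calling "condor_submit_dag" while it runs) could be wrapped so that the exit code and full output are captured for the job log. This is only a rough sketch in Python: the function names are hypothetical, and it assumes "condor_submit_dag" is on the PATH of the account the job runs under. The only flag used is "-no_submit", which is mentioned elsewhere in this thread.

```python
import subprocess
from typing import List, Tuple


def build_submit_dag_command(dag_file: str, no_submit: bool = False) -> List[str]:
    """Build the argument list for condor_submit_dag (hypothetical helper)."""
    cmd = ["condor_submit_dag"]
    if no_submit:
        # -no_submit writes the DAGMan submit file without queueing it.
        cmd.append("-no_submit")
    cmd.append(dag_file)
    return cmd


def submit_dag(dag_file: str, no_submit: bool = False) -> Tuple[int, str]:
    """Run condor_submit_dag and return (exit code, combined stdout/stderr)."""
    result = subprocess.run(
        build_submit_dag_command(dag_file, no_submit),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error text such as "No credential stored"
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Job A would call submit_dag("analysis.dag") and write the returned text to its own output, so the exact error is preserved even when nobody is logged in to the submit machine.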

> >>

> >

> > It seems like what you want to do is have a DAG as follows:
> >
> > JOB A create_analysis.cmd
> > SUBDAG EXTERNAL B analysis.dag
> > PARENT A CHILD B

> >

> > If a pre script is appropriate here, use:

> > SUBDAG EXTERNAL B analysis.dag

> > SCRIPT PRE B create_analysis

> >

> > A will compute the necessary dependencies and prepare the analysis dag for B.
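As an illustration of the "prepare the analysis dag" step, node A (or the PRE script) might write analysis.dag from the dependencies it has computed. The sketch below is Python and makes several assumptions: the helper names, the node names, and the submit file names are all hypothetical, and only the basic JOB and PARENT ... CHILD DAG keywords are used.

```python
from typing import Dict, Iterable, List, Tuple


def render_dag(jobs: Dict[str, str], edges: Iterable[Tuple[str, str]]) -> str:
    """Render DAG file text from node -> submit-file pairs and (parent, child) edges."""
    lines: List[str] = []
    for name, submit_file in jobs.items():
        lines.append(f"JOB {name} {submit_file}")
    for parent, child in edges:
        # Refuse edges that mention a node with no JOB line.
        for node in (parent, child):
            if node not in jobs:
                raise ValueError(f"edge references unknown node: {node}")
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


def write_dag(path: str, jobs: Dict[str, str], edges: Iterable[Tuple[str, str]]) -> None:
    """Write the rendered DAG text to the given path."""
    with open(path, "w") as handle:
        handle.write(render_dag(jobs, edges))
```

For example, write_dag("analysis.dag", {"step1": "step1.sub", "step2": "step2.sub"}, [("step1", "step2")]) would produce a two-node DAG in which step2 runs after step1.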

> >

> > I am confused whether this is a SOAP problem or a DAG problem.

> >

> > Nathan Panike

> >

> >> We tried forcing Condor to use a specific user all the time on the submit machines by adding "SLOT1_user = <UID domain user>" for a domain user that has its credentials stored, but still no luck.

> >>

> >>

> >> -----Original Message-----

> >> From: Todd Tannenbaum [mailto:tannenba@xxxxxxxxxxx]

> >> Sent: Wednesday, June 13, 2012 4:43 PM

> >> To: Condor-Users Mail List

> >> Cc: Belai Beshah

> >> Subject: Re: [Condor-users] DAG submission using Condor SOAP

> >> webservice

> >>

> >> After a quick read of the below, it sounds as if you are submitting DAGMan as a vanilla universe job and therefore having it run under the starter on an execute node. Typically DAGMan is run as a scheduler universe job under the schedd, no startd involved.

> >>

> >> Essentially your soap submission should submit a scheduler universe job to a schedd, the executable should be dagman, and you should move all files (dagman input file, all the node submit files, and all the files referenced in those submit files) via the SOAP file staging calls.

> >>

> >> Some additional pointers:

> >>

> >> http://spinningmatt.wordpress.com/2009/11/02/submitting-a-workflow-to-condor-via-soap-using-java/

> >>

> >> Also perhaps helpful is this URL, which talks about how to submit dagman into a grid universe (you'll want to use SOAP to the scheduler universe, but several of the concepts/ideas are the same):

> >>

> >> https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=DagManUnderCondorc

> >>

> >> Hope this helps,

> >> Todd

> >>

> >> On 6/13/2012 5:55 AM, Belai Beshah wrote:

> >>> Hi All,

> >>>

> >>> We have a simple out-of-the-box Condor 7.6.7 on Windows 7 with the
> >>> SOAP interface configured and SOAP READ/WRITE access given to
> >>> everybody on the network. Submitting simple jobs from other
> >>> machines using SOAP is working great. However, we are facing a
> >>> problem when we try to submit a job that itself tries to submit a
> >>> DAG. We have made the submitter machines advertise DAG capability,
> >>> and the jobs that try to submit DAG jobs require it, so that these
> >>> jobs run only on the submitter machines. Here is the part of
> >>> "condor_config.local" for this setup:

> >>>

> >>> # Added to the submit machine to ONLY accept jobs that require DAG and advertise this machine as having DAG
> >>> HAS_CONDOR_DAG = True
> >>> STARTD_ATTRS = HAS_CONDOR_DAG, $(STARTD_ATTRS)
> >>> START = (JOB_GROUP =?= "REQ_CONDOR_DAG") && $(START)
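For the job side of this matching, a submit description for Job A would need to carry the JOB_GROUP attribute and require the advertised machine attribute. The Python sketch below only renders such a description as text. It is an assumption about how the job might look if submitted with condor_submit rather than through SOAP; the helper name and the executable name are hypothetical.

```python
def render_job_a_submit(executable: str) -> str:
    """Render a vanilla-universe submit description that targets DAG-capable machines."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # Only match machines that advertise HAS_CONDOR_DAG = True.
        "requirements = (HAS_CONDOR_DAG =?= True)",
        # Custom job attribute tested by the machine's START expression.
        '+JOB_GROUP = "REQ_CONDOR_DAG"',
        "queue",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_job_a_submit("create_analysis.cmd") yields text that could be saved as a submit file; a SOAP client would instead set the equivalent attributes on the job ClassAd.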

> >>>

> >>> The matching of Jobs to submit machines is happening correctly but

> >>> when the job starts to run it errors out with the message:

> >>>

> >>> ERROR: No credential stored for condor-reuse-slot1@PT-MASTER

> >>>

> >>> Correct this by running:

> >>>

> >>> condor_store_cred add

> >>>

> >>> ERROR: condor_submit failed; aborting

> >>>

> >>> According to the docs, the starter will assign a new randomly
> >>> generated password to the "condor-reuse-slot1" account, so storing a
> >>> credential by hand will not be a solution. We are using the
> >>> "run_as_owner" flag for the simple jobs. Is there a way to tell the
> >>> DAG jobs to run as owner without going to the extra step of
> >>> generating the DAG using "condor_submit_dag -no_submit" and somehow
> >>> editing the resulting DAG? That is very difficult, since the
> >>> machines submitting using SOAP don't have access to the submitter
> >>> machine's file system.

> >>>

> >>> Thanks

> >>>

> >>> Belai

> >>>

> >>>

> >>>

> >>> _______________________________________________

> >>> Condor-users mailing list

> >>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx

> >>> with a

> >>> subject: Unsubscribe

> >>> You can also unsubscribe by visiting

> >>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users

> >>>

> >>> The archives can be found at:

> >>> https://lists.cs.wisc.edu/archive/condor-users/

> >>>

> >>

> >>

> >> --

> >> Todd Tannenbaum<tannenba@xxxxxxxxxxx> University of Wisconsin-Madison

> >> Center for High Throughput Computing Department of Computer Sciences

> >> Condor Project Technical Lead 1210 W. Dayton St. Rm #4257

> >> Phone: (608) 263-7132 Madison, WI 53706-1685

> >>

> >>

> >

> > --

> > Nathan Panike, nwp@xxxxxxxxxxx

> > Scientific Programmer

> > UW-Madison Center for High Throughput Computing Computer Sciences

> > Department, Room 4280

> > 1210 W. Dayton St.

> > Madison, WI 53706 USA

> > 608.890.0032

> >

> > Scientific Programmer

> > Laboratory for Molecular and Computational Genomics Biotechnology

> > Center

> > 425 Henry Mall, room 5445

> > Madison, WI 53706 USA

> > 608.890.0086

>
