
Re: [Condor-users] DAG submission using Condor SOAP webservice



The problem is with automatically running the DAG on the submit machine. SOAP is working as expected: it submits and starts Job A correctly.

I would like to try your suggestion, but there are a couple of hurdles: we have not yet built a means of submitting a DAG using SOAP, and modifying the application that submits the DAG will require changes by another group. I am still not sure it will work, though, since it will still require being able to submit a DAG job from the submitter machine without a human being logged into that machine.

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Nathan Panike
Sent: Thursday, June 14, 2012 9:55 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] DAG submission using Condor SOAP webservice

On Thu, Jun 14, 2012 at 08:39:53AM -0400, Belai Beshah wrote:
> Thanks, Todd, for the suggestion, but after looking at the two pointers
> carefully, our problem seems different. We submit, let's say, Job A
> using SOAP, and when Job A runs it calculates the dependencies
> required based on the input dataset and tries to submit a DAG, let's
> say Job B. From the outside Java client that uses the SOAP API we
> don't know what the DAG dependencies are and cannot directly generate
> the DAG for Job B, so we have to submit the vanilla universe Job A to
> figure the dependencies out and submit a DAG job, not through the
> SOAP API but by calling "condor_submit_dag" directly when it runs.
> The problem we are facing is that a vanilla universe job running
> successfully on a submitter machine is not able to submit a DAG. We
> can see that the DAG files get created correctly; it is only the
> "condor_submit_dag" call that fails. If we log in to the submit machine
> we can manually submit the DAG file created for Job B successfully.
> 

It seems like what you want to do is have a dag as follows:

JOB A create_analysis.cmd
SUBDAG EXTERNAL B analysis.dag
PARENT A CHILD B

If a pre script is appropriate here, use:
SUBDAG EXTERNAL B analysis.dag
SCRIPT PRE B create_analysis

A will compute the necessary dependencies and prepare the analysis dag for B.
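
As a rough illustration of the first layout above, here is a minimal Python sketch that writes out the three-line wrapper DAG. The function name render_wrapper_dag is hypothetical; the file names are the ones used in this message, and nothing here talks to Condor itself.

```python
def render_wrapper_dag(create_submit="create_analysis.cmd",
                       inner_dag="analysis.dag"):
    """Return the text of a two-node DAG.

    Node A runs the job that computes the dependencies and writes
    inner_dag; node B then runs inner_dag as an external sub-DAG.
    """
    lines = [
        f"JOB A {create_submit}",
        f"SUBDAG EXTERNAL B {inner_dag}",
        "PARENT A CHILD B",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the wrapper DAG so it can be redirected into a file.
    print(render_wrapper_dag(), end="")
```

The output could be saved under any name (for example outer.dag, an assumed name) and submitted once, so that the job computing the dependencies never has to call condor_submit_dag itself.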

I am confused whether this is a SOAP problem or a DAG problem.

Nathan Panike

> We tried forcing Condor to use a specific user all the time on the submit machines by adding "SLOT1_user = <UID domain user>" for a domain user that has its credential stored, but still no luck.
> 
> 
> -----Original Message-----
> From: Todd Tannenbaum [mailto:tannenba@xxxxxxxxxxx]
> Sent: Wednesday, June 13, 2012 4:43 PM
> To: Condor-Users Mail List
> Cc: Belai Beshah
> Subject: Re: [Condor-users] DAG submission using Condor SOAP 
> webservice
> 
> After a quick read of the below, it sounds as if you are submitting DAGMan as a vanilla universe job and therefore having it run under the starter on an execute node. Typically DAGMan is run as a scheduler universe job under the schedd, no startd involved.
> 
> Essentially your soap submission should submit a scheduler universe job to a schedd, the executable should be dagman, and you should move all files (dagman input file, all the node submit files, and all the files referenced in those submit files) via the SOAP file staging calls.
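
One way to work out which files the paragraph above asks you to stage is to scan the DAG input file for the files its nodes name. The Python sketch below does only that first step, under stated assumptions: files_to_stage is a hypothetical helper, it recognizes only JOB and SUBDAG EXTERNAL lines, and it does not open the node submit files to find the executables and input files they reference, which would also need staging.

```python
def files_to_stage(dag_path, dag_text):
    """List the DAG file plus every submit or sub-DAG file its nodes name.

    Only "JOB <name> <submit file>" and
    "SUBDAG EXTERNAL <name> <dag file>" lines are recognized.
    """
    staged = [dag_path]
    for raw in dag_text.splitlines():
        parts = raw.split()
        # Skip blank lines and comments.
        if not parts or parts[0].startswith("#"):
            continue
        keyword = parts[0].upper()
        if keyword == "JOB" and len(parts) >= 3:
            name = parts[2]
        elif (keyword == "SUBDAG" and len(parts) >= 4
              and parts[1].upper() == "EXTERNAL"):
            name = parts[3]
        else:
            continue
        # Keep first-seen order and avoid duplicates.
        if name not in staged:
            staged.append(name)
    return staged
```

The returned list is just a set of file names; sending each one through the SOAP file staging calls is left to the client.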
> 
> Some additional pointers:
>  
> http://spinningmatt.wordpress.com/2009/11/02/submitting-a-workflow-to-condor-via-soap-using-java/
> 
> Also perhaps helpful is this URL, which talks about how to submit dagman into a grid universe (you'll want to use SOAP to the scheduler universe, but several of the concepts/ideas are the same):
>     
> https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=DagManUnderCondorc
> 
> Hope this helps,
> Todd
> 
> On 6/13/2012 5:55 AM, Belai Beshah wrote:
> > Hi All,
> >
> > We have a simple out-of-the-box Condor 7.6.7 on Windows 7 with the
> > SOAP interface configured and SOAP READ/WRITE access given to
> > everybody on the network. Submitting simple jobs from other
> > machines using SOAP is working great. However, we are facing a problem
> > when we try to submit a job that itself tries to submit a DAG. What we
> > have done is to make the submitter machines advertise as having DAG
> > capabilities, and make the jobs that submit DAGs require that
> > capability, so that these jobs run only on the submitter machines.
> > Here is the part of "condor_config.local" for this setup:
> >
> > # Added to the submit machine to ONLY accept jobs that require DAG
> > # and advertise this machine as having DAG
> >
> > HAS_CONDOR_DAG = True
> >
> > STARTD_ATTRS = HAS_CONDOR_DAG, $(STARTD_ATTRS)
> >
> > START = (JOB_GROUP =?= "REQ_CONDOR_DAG") && $(START)
> >
> > The matching of Jobs to submit machines is happening correctly but 
> > when the job starts to run it errors out with the message:
> >
> > ERROR: No credential stored for condor-reuse-slot1@PT-MASTER
> >
> > Correct this by running:
> >
> > condor_store_cred add
> >
> > ERROR: condor_submit failed; aborting
> >
> > According to the docs, the starter will assign a new randomly
> > generated password to the "condor-reuse-slot1" account, so storing a
> > credential by hand will not be a solution. We are using the
> > "run_as_owner" flag for the simple jobs. Is there a way to tell the
> > DAG jobs to run as owner without going to the extra step of
> > generating the DAG using "condor_submit_dag -no_submit" and somehow
> > editing the resulting DAG? (That is very difficult, since the
> > machines submitting using SOAP don't have access to the submitter
> > machines' file system.)
> >
> > Thanks
> >
> > Belai
> >
> >
> >
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx 
> > with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting 
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/
> >
> 
> 
> --
> Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
> Center for High Throughput Computing   Department of Computer Sciences
> Condor Project Technical Lead          1210 W. Dayton St. Rm #4257
> Phone: (608) 263-7132                  Madison, WI 53706-1685
> 
> 

--
Nathan Panike, nwp@xxxxxxxxxxx
Scientific Programmer
UW-Madison Center for High Throughput Computing Computer Sciences Department, Room 4280
1210 W. Dayton St.
Madison, WI 53706 USA
608.890.0032

Scientific Programmer
Laboratory for Molecular and Computational Genomics Biotechnology Center
425 Henry Mall, room 5445
Madison, WI 53706 USA
608.890.0086