Re: [HTCondor-users] Application specific scheduler



Nick,

I would recommend going with option #2, with the understanding that you need to decide whether step 3 of DAG number n will submit DAG number n+1 as an independent HTCondor job, or whether it will create a "nested" DAG so that all jobs are part of one BIG DAG.
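
As a rough illustration of what such an adapter might emit, here is a minimal Python sketch that writes the DAG description for one round of the simulate / load / analyze cycle. The submit file names (sim.sub, load_db.sub, analyze.sub), the node names, and the round_N.dag naming scheme are all hypothetical. With nested=True the next round is attached as a SUBDAG EXTERNAL node; with nested=False the analysis job is assumed to submit the next DAG itself. How the chain of rounds terminates is left open here.

```python
def build_dag(round_no, n_sims, nested=True):
    """Return the text of a DAGMan input file for one round.

    All file and node names below are illustrative assumptions.
    """
    sims = [f"sim_{i}" for i in range(n_sims)]
    lines = []
    for i, name in enumerate(sims):
        # One node per simulation, all sharing a single submit file.
        lines.append(f"JOB {name} sim.sub")
        # Pass the simulation index to the submit file as a macro.
        lines.append(f'VARS {name} index="{i}"')
    lines.append("JOB load_db load_db.sub")
    lines.append("JOB analyze analyze.sub")
    # The database load runs only after every simulation has finished.
    lines.append(f"PARENT {' '.join(sims)} CHILD load_db")
    lines.append("PARENT load_db CHILD analyze")
    if nested:
        # Nested variant: the next round becomes a node of this DAG.
        # The analyze node is assumed to write round_<n+1>.dag.
        lines.append(f"SUBDAG EXTERNAL next_round round_{round_no + 1}.dag")
        lines.append("PARENT analyze CHILD next_round")
    # Independent variant: nothing more to add here; the analyze job
    # would submit the next DAG on its own.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dag(round_no=1, n_sims=3), end="")
```

The nested form keeps every round visible under one top-level DAGMan job, while the independent form keeps each DAG file small and self-contained.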

You will also have to keep in mind what happens when a DAGMan job is restarted, as it will play back all the nodes, including the nodes that interact with the database.
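
One way to make a database node safe to play back is to make it idempotent, so that running it twice leaves the database in the same state as running it once. The following is only a sketch under assumptions: it uses SQLite from the Python standard library as a stand-in for whatever database you actually use, and the table and column names are invented for the example.

```python
import sqlite3


def load_results(conn, round_no, results):
    """Load one round of results; safe to run more than once.

    `results` maps a simulation id to its value. The schema is a
    hypothetical example, not a recommendation.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " round_no INTEGER, sim_id INTEGER, value REAL,"
        " PRIMARY KEY (round_no, sim_id))"
    )
    with conn:  # commit on success, roll back on error
        # The primary key plus INSERT OR REPLACE means a replayed node
        # overwrites its earlier rows instead of duplicating them.
        conn.executemany(
            "INSERT OR REPLACE INTO results (round_no, sim_id, value)"
            " VALUES (?, ?, ?)",
            [(round_no, sim_id, value) for sim_id, value in results.items()],
        )


def count_rows(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    data = {0: 1.5, 1: 2.5}
    load_results(conn, 1, data)
    load_results(conn, 1, data)  # simulated replay of the same node
    print(count_rows(conn))
```

The same idea applies to the analysis step: if it records which round it has already processed, a replay can detect that and skip requesting the follow-up simulations a second time.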

Miron



On 6/22/2014 10:07 AM, Nick Cooper wrote:
Hi All,

I am currently looking at migrating from our home-grown distributed
computing software to HTCondor. Over the years, users have created
complex "job managers" written in C++ which are equivalent to
application specific DAGMan scripts. To reduce the burden on users
migrating to HTCondor we would like to provide an adaptor between a "job
manager" and HTCondor.

An example of a simple "Job Manager" is one which (all within the same
cluster):
1. Requests 1000 simulation jobs to be executed
2. When all 1000 simulation jobs are completed, creates a database and
loads the results into it
3. Does analysis on the results in the database and based on the
analysis requests further simulation jobs to be executed. All without
any user involvement.

From what I have read, our options are:
1. Web Service: Write an adapter using the SOAP interface. I suspect
there is not enough feedback regarding when a job completes / fails.
2. DAGMan: Write an adapter that generates DAGMan scripts.
3. DRMAA: Write an adapter that submits and monitors jobs via the DRMAA API.

Can someone confirm if I am on the correct track?
Does anyone have any suggestions / words of wisdom for this kind of
requirement?

Further info:
- Windows based pool
- Job manager is a C++ DLL
- Looking at using the current stable release of HTCondor
- Jobs will run in the Vanilla Universe
- Jobs will need to be run under the submitter's Active Directory credentials

Thanks Nick

