
Re: [HTCondor-users] Non trivial way of using DAG



Hi Lorenzo,

This is a very interesting workflow to automate. I have a few questions about it:
  1. What is the upper limit on the number of times this workflow can cycle? I assume it's a large number, but there must be some number of cycles within which you expect the work to be finished.
  2. Are all N data-subset computational jobs identical except for minor variations (different input/output files, arguments, etc.)?
  3. How important is it for you to know which iteration the cycle is on?
-Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Lorenzo Mobilia <l.mobilia@xxxxxxxxxxxxxxxx>
Sent: Wednesday, October 25, 2023 2:45 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Non trivial way of using DAG
 
Hi, 

I am having some difficulty using DAGs. Basically, I need to:

1. Take a dataset D1
2. Split it into N sub-datasets
3. Perform some computation on each of these N sub-datasets
4. Merge the resulting sub-datasets into another dataset D2
5. Restart the process (back to step 1), now using D2

This continues until the final dataset has reached specific characteristics. The problems are:

A. I don't know a priori how many times I need to split D1
B. I don't know a priori how many times I need to perform this cycle
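
To make the cycle above concrete, here is a minimal Python sketch of the control flow only, run in memory rather than on HTCondor. All names (`split`, `run_cycles`, `choose_n`, `compute`, `is_done`, `max_cycles`) are hypothetical placeholders, and the `max_cycles` cap is an assumption added so that the loop cannot run forever.

```python
from typing import Callable, List, Sequence, Tuple


def split(dataset: Sequence[float], n: int) -> List[List[float]]:
    """Split dataset into n contiguous chunks of near-equal size (hypothetical splitter)."""
    n = max(1, min(n, len(dataset)))  # never more chunks than items, at least one
    size, extra = divmod(len(dataset), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(list(dataset[start:end]))
        start = end
    return chunks


def run_cycles(
    d1: Sequence[float],
    choose_n: Callable[[Sequence[float]], int],    # decides N for the current dataset
    compute: Callable[[List[float]], List[float]],  # work done on one sub-dataset
    is_done: Callable[[Sequence[float]], bool],     # stopping criterion on the merged dataset
    max_cycles: int,                                # assumed safety cap on iterations
) -> Tuple[List[float], int]:
    """Repeat split -> compute -> merge until is_done() holds or max_cycles is hit."""
    dataset = list(d1)
    for cycle in range(1, max_cycles + 1):
        parts = split(dataset, choose_n(dataset))           # step 2
        results = [compute(part) for part in parts]         # step 3 (parallel on a pool)
        dataset = [x for part in results for x in part]     # step 4: merge into D2
        if is_done(dataset):
            return dataset, cycle
    raise RuntimeError(f"stopping criterion not met after {max_cycles} cycles")
```

In a real deployment step 3 would be the N cluster jobs; here it is an ordinary list comprehension so the loop logic can be checked on its own.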

The solution I came up with is to build a main program which controls this flow, but after some cycles it crashes.
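
As one possible shape for such a controller, the sketch below generates the DAGMan description for a single cycle once N is known: N compute nodes followed by one merge node. It is not the program described above. The function name `build_cycle_dag`, the submit file names `compute.sub` and `merge.sub`, and the macro names `cycle`, `part` and `n_parts` are all assumptions made for illustration; only the `JOB`, `VARS` and `PARENT ... CHILD` keywords are standard DAGMan syntax.

```python
def build_cycle_dag(n_parts: int, cycle: int,
                    compute_sub: str = "compute.sub",
                    merge_sub: str = "merge.sub") -> str:
    """Return DAG file text for one cycle: n_parts compute nodes, then one merge node.

    The submit file names and macro names are hypothetical placeholders.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")

    lines = []
    compute_nodes = []
    for i in range(n_parts):
        name = f"compute_{i}"
        compute_nodes.append(name)
        lines.append(f"JOB {name} {compute_sub}")
        # Macros the (assumed) submit file can reference as $(cycle) and $(part).
        lines.append(f'VARS {name} cycle="{cycle}" part="{i}"')

    lines.append(f"JOB merge {merge_sub}")
    lines.append(f'VARS merge cycle="{cycle}" n_parts="{n_parts}"')
    # The merge node runs only after every compute node has finished successfully.
    lines.append(f"PARENT {' '.join(compute_nodes)} CHILD merge")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the DAG for cycle 3 with two sub-datasets.
    print(build_cycle_dag(n_parts=2, cycle=3), end="")
```

The returned text could be written to a file and handed to `condor_submit_dag`. How successive cycles are chained together (for example by a driver that submits one DAG per cycle) is left open here.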

If anyone has suggestions, or is interested in this problem and would like more information, please let me know!

Lorenzo