Re: [HTCondor-users] DAGMan low submission rate


On 27 June 2017 at 14:49, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

In 8.6, condor_submit has support for retrying a job that failed, just like DAGMan does. The only thing that DAGMan can do there that condor_submit can't do is pre and post scripts. Is that what you mean?


-tj


From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Benedikt Riedel
Sent: Tuesday, June 27, 2017 2:42 PM

To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] DAGMan low submission rate


Hi,


By job monitoring, I mean which jobs in my set have failed, succeeded, etc. With 1.2M nodes, it can quickly become a pain to do all this outside of DAGMan.
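To make that concrete, here is a rough sketch of the kind of tally one would have to maintain by hand. It assumes the jobs write the classic plain-text job event log (events introduced by a three-digit code and a `(cluster.proc.subproc)` ID, with `005` marking termination); the function name `tally_job_states` and the state labels are made up for illustration and are not part of HTCondor.

```python
import re
from collections import Counter

# Event header, e.g. "005 (1234.000.000) 06/27 14:55:00 Job terminated."
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.\d+\)")
# Detail line of a 005 event, e.g. "(1) Normal termination (return value 0)"
RETURN_VALUE = re.compile(r"Normal termination \(return value (\d+)\)")

# Assumed mapping from event code to a coarse state (labels are hypothetical).
SIMPLE_STATES = {"000": "idle", "001": "running", "009": "removed", "012": "held"}


def tally_job_states(lines):
    """Return a Counter of the latest coarse state of every job in the log."""
    states = {}         # (cluster, proc) -> latest state seen
    terminating = None  # job whose 005 event body is currently being read
    for line in lines:
        header = EVENT_HEADER.match(line)
        if header:
            code, cluster, proc = header.groups()
            job = (int(cluster), int(proc))
            terminating = None
            if code == "005":
                # Count as failed unless a zero return value follows.
                states[job] = "failed"
                terminating = job
            elif code in SIMPLE_STATES:
                states[job] = SIMPLE_STATES[code]
            continue
        if terminating is not None:
            result = RETURN_VALUE.search(line)
            if result:
                ok = result.group(1) == "0"
                states[terminating] = "succeeded" if ok else "failed"
                terminating = None
    return Counter(states.values())
```

Passing an open file works, since the function only iterates over lines, e.g. `tally_job_states(open("jobs.log"))`, where `jobs.log` stands for whatever path the submit file's `log` command points to. Retries, resubmission of failed jobs, and so on would all have to be layered on top of this.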


Benedikt


On 27 June 2017 at 14:33, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

Can you clarify what you mean by "job monitoring"?


From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Benedikt Riedel
Sent: Tuesday, June 27, 2017 10:31 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] DAGMan low submission rate


Hi,


This might be tangential. DAGMan and late materialization are not compatible at the moment. With late materialization, users seemingly have to go off and write their own job monitoring code. DAGMan provides job monitoring capability for free. Am I missing something? What is the best practice for monitoring large independent job sets when using late materialization?


As for Henning's user's issue, have you tried setting


DAGMAN_USER_LOG_SCAN_INTERVAL = 1


This appears to increase the submission rate.


Thanks,


Benedikt


On 27 June 2017 at 09:53, Greg Thain <gthain@xxxxxxxxxxx> wrote:

On 06/27/2017 01:30 AM, Henning Fehrmann wrote:

Hello,

one of our users started a DAG with 1.2M nodes which do not depend on
other nodes. It seems that on average 120 jobs are submitted per
minute. This number fluctuates strongly.

If there are no dependencies between nodes, and you aren't using other DAGMan features like pre/post scripts, would you consider upgrading to 8.7, to use the late materialization feature instead of DAGMan? It was designed exactly for this use case.

-greg

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@cs.wisc.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--

Benedikt Riedel
Scientific Programmer
University of Chicago
Computation Institute

