Re: [HTCondor-users] DAGMan low submission rate



Hi,

By job monitoring, I mean which jobs in my set have failed, succeeded, etc. With 1.2M nodes, it can quickly become a pain to do all this outside of DAGMan.
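
To make concrete what doing this outside of DAGMan could involve, here is a minimal sketch, not an official HTCondor tool. It assumes all jobs write to one job event log in the default text format, where each event starts with a three-digit code and a (cluster.proc.subproc) job ID, and a normal termination reports "(return value N)" in the event body. The function name and the state labels are made up for illustration.

```python
import re
from collections import Counter

# Event codes found at the start of each event header in the job event log.
EVENT_STATES = {
    "000": "submitted",
    "001": "running",
    "004": "evicted",
    "005": "terminated",
    "009": "aborted",
    "012": "held",
    "013": "released",
}

# Event header, e.g. "005 (1234.000.000) 06/27 10:40:00 Job terminated."
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.\d+\) ")
# Body line of a normal termination, e.g. "(1) Normal termination (return value 0)"
RETURN_VALUE = re.compile(r"\(return value (-?\d+)\)")


def tally_job_states(log_lines):
    """Count jobs by the most recent state seen for each (cluster, proc)."""
    latest = {}
    terminating = None  # job whose 005 event body is currently being read
    for line in log_lines:
        header = EVENT_HEADER.match(line)
        if header:
            code = header.group(1)
            job = (int(header.group(2)), int(header.group(3)))
            terminating = job if code == "005" else None
            state = EVENT_STATES.get(code)
            if state is not None:
                latest[job] = state
            continue
        if terminating is not None:
            result = RETURN_VALUE.search(line)
            if result:
                # Refine "terminated" into succeeded/failed by exit code.
                ok = int(result.group(1)) == 0
                latest[terminating] = "succeeded" if ok else "failed"
                terminating = None
    return Counter(latest.values())
```

It could be called as `tally_job_states(open("jobs.log"))`, where `jobs.log` is a placeholder for whatever the submit file names as its log. Even this leaves out retries and resubmission of failed jobs, which DAGMan handles already.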

Benedikt

On 27 June 2017 at 14:33, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

Can you clarify what you mean by "job monitoring"?

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Benedikt Riedel
Sent: Tuesday, June 27, 2017 10:31 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] DAGMan low submission rate

Hi,

This might be tangential. DAGMan and late materialization are not compatible at the moment. With late materialization, users seemingly have to go off and write their own job monitoring code. DAGMan provides job monitoring capability for free. Am I missing something? What is the best practice for monitoring large independent job sets when using late materialization?

As for Henning's user's issue, have you tried setting

DAGMAN_USER_LOG_SCAN_INTERVAL = 1

This appears to increase the submission rate.

Thanks,

Benedikt

On 27 June 2017 at 09:53, Greg Thain <gthain@xxxxxxxxxxx> wrote:

On 06/27/2017 01:30 AM, Henning Fehrmann wrote:

Hello,

one of our users started a DAG with 1.2M nodes which do not depend on
other nodes. It seems that on average 120 jobs are submitted per
minute. This number fluctuates strongly.


If there are no dependencies between nodes, and you aren't using other DAGMan features like pre/post scripts, would you consider upgrading to 8.7 to use the late materialization feature instead of DAGMan? It was designed exactly for this use case.

-greg



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@cs.wisc.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/



--

Benedikt Riedel
Scientific Programmer
University of Chicago
Computation Institute

