
Re: [HTCondor-users] Dagman Issue V9

Hi David,

Can you describe the dag where this is happening (or better yet, send
me the .dag file)? When you mention a child dag, are you talking about
an external subdag or something different?

By default every dag is supposed to have these attributes in its
classad. I just did a quick test to verify this. So I'm wondering if
there's something special about your environment causing it to not be

Are you setting a custom value for max jobs? (either with the
DAGMAN_MAX_JOBS_SUBMITTED configuration knob or the -maxjobs submit
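
One way to confirm which of these attributes are missing from a given DAGMan job ad is to dump the ad as text and scan it. Below is a minimal sketch, assuming the ad is available as plain "Name = Value" lines (the form printed by condor_q -long). The function name is hypothetical and not part of HTCondor; the attribute list is copied from the warnings quoted below. Names are compared case-insensitively, since ClassAd attribute names are case-insensitive.

```python
# Attribute names taken from the warnings quoted in this thread.
EXPECTED_ATTRS = [
    "DAGMan_MaxIdle",
    "DAGMan_MaxJobs",
    "DAGMan_MaxPreScripts",
    "DAGMan_MaxPostScripts",
    "DAGMan_MaxHoldScripts",
]


def missing_dagman_attrs(ad_text):
    """Return the expected attributes that do not appear in a text ClassAd dump.

    Hypothetical helper: ad_text is assumed to hold one "Name = Value" pair
    per line.
    """
    present = set()
    for line in ad_text.splitlines():
        # Split on the first "=" only, so expression values stay intact.
        name, sep, _value = line.partition("=")
        if sep:
            present.add(name.strip().lower())
    return [attr for attr in EXPECTED_ATTRS if attr.lower() not in present]
```

Running this on the ad of a parent DAG and on the ad of one of its child DAGs would show whether only the children lack the attributes.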


On Mon, Jun 7, 2021 at 6:54 AM <duduhandelman@xxxxxxxxxxx> wrote:
> Hi Again,
> I forgot to mention that it's happening on DAGs that submit other DAGs; only the child DAGs are affected.
> Also, the child DAGs do not have the following ClassAd attributes, and I don't know whether that's OK:
> DAGMan_MaxIdle
> DAGMan_MaxJobs
> DAGMan_MaxPreScripts
> DAGMan_MaxPostScripts
> DAGMan_MaxHoldScripts
> Thanks Again,
> David
> ________________________________
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of duduhandelman@xxxxxxxxxxx <duduhandelman@xxxxxxxxxxx>
> Sent: 07 June 2021 14:29
> To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Dagman Issue V9
> Hi All,
> A week ago I upgraded from HTCondor 8.8 to 9.0.1, and I'm now facing an issue with DAGMan jobs.
> Most of the jobs run as expected, but some DAGMan jobs stop submitting jobs after a while.
> It seems that the DAGMan job asks for DAGMan_MaxJobs and sometimes gets a positive value but sometimes gets a negative number, which I assume is causing the issue.
> The schedd debug log prints:
> GetAttributeInt(968372, 0 , DAGMAN_MaxJobs) not found.
> The DAG output displays the following every few minutes:
> Warning: failed to get attribute DAGMan_MaxIdle
> Warning: failed to get attribute DAGMan_MaxJobs
> Warning: failed to get attribute DAGMan_MaxPreScripts
> Warning: failed to get attribute DAGMan_MaxPostScripts
> Warning: failed to get attribute DAGMan_MaxHoldScripts
> It seems like the value is garbage, probably not initialized.
> Any clues? Could it be a security issue?
> Many Thanks
> David
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/

Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison