Re: [HTCondor-users] Dagman Issue V9
- Date: Mon, 7 Jun 2021 11:19:58 -0500
- From: Mark Coatsworth <coatsworth@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Dagman Issue V9
Can you describe the dag where this is happening (or better yet, send
me the .dag file)? When you mention a child dag, are you talking about
an external subdag or something different?
By default every dag is supposed to have these attributes in its
classad. I just did a quick test to verify this. So I'm wondering if
there's something special about your environment causing it to not be
Are you setting a custom value for max jobs (either with the
DAGMAN_MAX_JOBS_SUBMITTED configuration knob or the -maxjobs submit
option)?
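To see which attributes are affected and how often, the warnings can be tallied from the DAG's output file. This is only a minimal sketch: it assumes the warning lines look exactly like the ones quoted below, and the function names and the file path you pass in are hypothetical, not part of HTCondor.

```python
import re
from collections import Counter

# Matches warning lines in the form quoted in this thread, e.g.
# "Warning: failed to get attribute DAGMan_MaxJobs".
# Assumption: the real output uses exactly this wording.
WARNING_RE = re.compile(r"Warning: failed to get attribute (\w+)")


def count_missing_attributes(lines):
    """Count how many times each attribute is reported as missing."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def report(path):
    """Print one 'attribute: count' line per missing attribute.

    `path` is whatever file holds the DAG output (hypothetical name).
    """
    with open(path, errors="replace") as handle:
        counts = count_missing_attributes(handle)
    for name, n in counts.most_common():
        print(f"{name}: {n}")
```

Running `report()` against the output of a parent DAG and of one of its child DAGs would show whether only the children log these warnings.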
On Mon, Jun 7, 2021 at 6:54 AM <duduhandelman@xxxxxxxxxxx> wrote:
> Hi Again,
> I forgot to mention that this happens on DAGs that submit other DAGs; only the child DAGs are affected.
> Also, the child DAG does not have those ClassAd attributes, and I don't know whether that is OK or not.
> Thanks Again,
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of duduhandelman@xxxxxxxxxxx <duduhandelman@xxxxxxxxxxx>
> Sent: 07 June 2021 14:29
> To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Dagman Issue V9
> Hi All,
> A week ago I upgraded from HTCondor 8.8 to 9.0.1, and I am facing an issue with DAGMan jobs.
> Most of the jobs run as expected, but some DAGMan jobs stop submitting jobs after a while.
> It seems that the DAGMan job asks for DAGMan_MaxJobs and sometimes gets a positive value but sometimes gets a negative number, which I assume is causing the issue.
> The schedd debug log prints:
> GetAttributeInt(968372, 0 , DAGMAN_MaxJobs) not found.
> The DAG output shows the following every few minutes:
> Warning: failed to get attribute DAGMan_MaxIdle
> Warning: failed to get attribute DAGMan_MaxJobs
> Warning: failed to get attribute DAGMan_MaxPreScripts
> Warning: failed to get attribute DAGMan_MaxPostScripts
> Warning: failed to get attribute DAGMan_MaxHoldScripts
> It seems like the value is garbage, probably not initialized.
> Any clues? Could it be a security issue?
> Many Thanks
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison