
Re: [HTCondor-users] Dagman Issue V9



Hi Again,
I forgot to mention that this happens on DAGs that submit other DAGs; only the child DAGs are affected.

Also, the child DAG's job ad does not have the following ClassAd attributes, and I don't know whether that is OK:

DAGMan_MaxIdle
DAGMan_MaxJobs
DAGMan_MaxPreScripts
DAGMan_MaxPostScripts
DAGMan_MaxHoldScripts
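
To make the check above repeatable, here is a minimal sketch of how the missing attributes could be detected from a job ad. It assumes the input is the "Name = value" text that `condor_q -long` prints for a single DAGMan job; the function name `missing_dagman_attrs` is hypothetical and not part of HTCondor.

```python
# Attribute names taken from the list above.
DAGMAN_LIMIT_ATTRS = [
    "DAGMan_MaxIdle",
    "DAGMan_MaxJobs",
    "DAGMan_MaxPreScripts",
    "DAGMan_MaxPostScripts",
    "DAGMan_MaxHoldScripts",
]


def missing_dagman_attrs(long_ad_text):
    """Return the DAGMan limit attributes absent from one job ad.

    long_ad_text is assumed to be the "Name = value" text printed by
    `condor_q -long` for a single job (hypothetical helper, not an
    HTCondor API). ClassAd attribute names are case-insensitive, so
    names are compared in lowercase.
    """
    present = set()
    for line in long_ad_text.splitlines():
        # Split on the first "=" only; values may contain "=" themselves.
        name, sep, _value = line.partition("=")
        if sep:
            present.add(name.strip().lower())
    return [attr for attr in DAGMAN_LIMIT_ATTRS if attr.lower() not in present]
```

For example, the output of `condor_q -long 968372` could be saved to a file and passed in as a string; an empty result would mean all five attributes are present in that job ad.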

Thanks Again,
David

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of duduhandelman@xxxxxxxxxxx <duduhandelman@xxxxxxxxxxx>
Sent: 07 June 2021 14:29
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Dagman Issue V9
 
Hi All,
A week ago I upgraded from HTCondor 8.8 to 9.0.1, and I'm now facing an issue with DAGMan jobs.
Most of the jobs run as expected, but some DAGMan jobs stop submitting jobs after a while.
It seems that the DAGMan job asks for DAGMan_MaxJobs and sometimes gets a positive value but sometimes gets a negative number, and I assume that is what causes the issue.

The schedd debug log prints:
GetAttributeInt(968372, 0 , DAGMAN_MaxJobs) not found.

The DAGMan output shows the following every few minutes:
Warning: failed to get attribute DAGMan_MaxIdle
Warning: failed to get attribute DAGMan_MaxJobs
Warning: failed to get attribute DAGMan_MaxPreScripts
Warning: failed to get attribute DAGMan_MaxPostScripts
Warning: failed to get attribute DAGMan_MaxHoldScripts


The value looks like garbage, probably because it is not initialized.
Any clues? Could it be a security issue?

Many Thanks
David