
Re: [HTCondor-users] Maximum SERVICE's run in local universe



One more thing/correction: we're running 23.3.0. My brain can't read
right this morning.

On Thu, Feb 22, 2024 at 6:42 AM Christopher Phipps
<hawtdogflvrwtr@xxxxxxxxx> wrote:
>
> I should also add that the corresponding logs for each DAG with a
> service that's still running say this:
>
> Warning: DAGMan thinks there are -1 idle jobs, even though the DAG is completed!
> ERROR: Warning is fatal error because of DAGMAN_USE_STRICT setting
> Aborting Dag...
> Writing Rescue DAG to x.rescue001...
> Removing submitted jobs...
> Removing any/all submitted HTCondor jobs...
>
> On Thu, Feb 22, 2024 at 5:43 AM Christopher Phipps
> <hawtdogflvrwtr@xxxxxxxxx> wrote:
> >
> > I forgot to report back on this. It worked perfectly! I have noticed
> > though, that sometimes the service node doesn't end when all of the
> > work associated with the service node completes. In fact, the service
> > job separates from the parent DAG and sits in the running state until
> > you remove it manually. At first I thought it was because the job
> > started and finished so quickly that the service didn't start
> > until after the job had completed, but it's happening with jobs
> > that take the better part of 15 hours to complete, and I've confirmed
> > that the service started long before anyone picked up the work. Have
> > you seen this before? Other than writing logic into the service to
> > check regularly for any remaining work, is there another way to force
> > the service to end gracefully when the rest of its DAG is done?
> >
> > Also, I forgot to mention last time that I'm running 23.0.3.
> >
> > On Tue, Feb 6, 2024 at 2:29 PM Cole Bollig via HTCondor-users
> > <htcondor-users@xxxxxxxxxxx> wrote:
> > >
> > > Hi Christopher,
> > >
> > > Assuming this relates to the DAGMan setup I helped with recently, the change has to be made in the Schedd configuration. Set START_LOCAL_UNIVERSE in the configuration of the AP (the host that the Schedd/DAGMan is running on). It defaults to TotalLocalJobsRunning < 200, so use something like:
> > >
> > > START_LOCAL_UNIVERSE = TotalLocalJobsRunning < n
> > >
> > > where n is the desired cap on the number of local universe jobs that can run at once on the host. Don't forget to reconfigure HTCondor afterwards (i.e. run condor_reconfig).
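> > >
> > > As a rough sketch of what that could look like on disk: the file name and the cap of 400 below are illustrative assumptions, not values from this thread, and the config.d path assumes a typical Linux package install.
> > >
> > > ```
> > > # /etc/condor/config.d/99-local-universe.conf  (hypothetical file name)
> > > # Raise the cap on concurrently running local universe jobs.
> > > # The default expression is: TotalLocalJobsRunning < 200
> > > # 400 is only an example value; pick whatever the AP can handle.
> > > START_LOCAL_UNIVERSE = TotalLocalJobsRunning < 400
> > > ```
> > >
> > > Running condor_config_val START_LOCAL_UNIVERSE on the AP should print the new expression if the file is being read; condor_reconfig then makes the running Schedd pick it up.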
> > >
> > > Cheers,
> > > Cole Bollig
> > > ________________________________
> > > From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Christopher Phipps <hawtdogflvrwtr@xxxxxxxxx>
> > > Sent: Tuesday, February 6, 2024 11:35 AM
> > > To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> > > Subject: [HTCondor-users] Maximum SERVICE's run in local universe
> > >
> > > Is there a way to increase the number of SERVICE jobs that can be
> > > running at the same time in the local universe? It appears to be
> > > limited by default to 200 and I'd like to increase it slightly.
> > >
> > > Thanks,
> > > Chris
> > > _______________________________________________
> > > HTCondor-users mailing list
> > > To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> > > subject: Unsubscribe
> > > You can also unsubscribe by visiting
> > > https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> > >
> > > The archives can be found at:
> > > https://lists.cs.wisc.edu/archive/htcondor-users/
> >
> >
> >
> > --
> > It will be happened; it shall be going to be happening; it will be was
> > an event that could will have been taken place in the future. Simple
> > as that. ~ Arnold Rimmer
>
>
>


