
Re: [HTCondor-users] Files in /var/lib/condor/execute/ filling up disk



On Thu, Jun 13, 2019 at 12:50 PM John G Heim <jheim@xxxxxxxxxxxxx> wrote:
>
> Well, as I said, I think the condor daemons crashed because the disk was
> full. And I think the disk was full because condor filled it. But it
> does not seem to be happening again. This may be an edge condition. I am
> not entirely sure the central manager was right. I am just going to
> forget about it unless it happens again.
>

If it does happen again, or if you want to get ahead of it, there are
two approaches I would suggest:

1. Keep the execute directory on a separate partition. This may or may
not be easy to do on your existing machines.

2. Set a policy to evict jobs when the disk is getting too full. See
the bottom half of
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToLimitDiskUsageOfJobs

Full disclosure, I wrote that section, so there's a good chance it's
unreliable. :-) But when I wrote it, we had a cluster with
significantly less total disk space than most of the rest of our
hardware. The caveat here is that I'm not sure how quickly the
attribute that policy checks gets updated, so if the usage balloons
too quickly, you may not catch it in time.
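
If you want a check that does not depend on how quickly that attribute
refreshes, something like the following could be run by hand or from
cron. To be clear, this is only a sketch and is not taken from the wiki
page: the function names and the 10% threshold are made up for
illustration, and the path is just the one from the subject line. It
only reports on the filesystem; it does not evict anything. The
device-number comparison is a rough test for approach 1 and may not be
meaningful on every setup (bind mounts, for example).

```python
import os
import shutil

# Assumption: execute directory path taken from the subject line of this thread.
EXECUTE_DIR = "/var/lib/condor/execute"
# Assumption: arbitrary illustrative threshold; tune it for your site.
MIN_FREE_FRACTION = 0.10


def on_separate_filesystem(path, other="/"):
    """Return True if `path` is on a different filesystem than `other`.

    Compares the device numbers reported by stat for the two paths.
    """
    return os.stat(path).st_dev != os.stat(other).st_dev


def free_fraction(path):
    """Return the free fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def too_full(path, min_free=MIN_FREE_FRACTION):
    """Return True if the free fraction has dropped below `min_free`."""
    return free_fraction(path) < min_free


if __name__ == "__main__":
    if os.path.isdir(EXECUTE_DIR):
        print("separate filesystem:", on_separate_filesystem(EXECUTE_DIR))
        print("free fraction: %.2f" % free_fraction(EXECUTE_DIR))
        print("too full:", too_full(EXECUTE_DIR))
    else:
        print("no such directory:", EXECUTE_DIR)
```

If the execute directory shares a filesystem with everything else, the
free fraction reflects all usage on that filesystem, not just jobs,
which is one more argument for the separate partition.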

-- 
Ben Cotton
He / Him / His
Fedora Program Manager
Red Hat
TZ=America/Indiana/Indianapolis