
Re: [HTCondor-users] What can hinder condor_startd to set DISK?



Hi again,

as for the MOUNT_UNDER_SCRATCH extension, maybe the strcat function could work, e.g.,
  strcat(MOUNT_UNDER_SCRATCH,"/something/else")

Cheers,
  Thomas


On 21/04/2023 13.31, Steffen Grunewald wrote:
On Fri, 2023-04-21 at 11:18:17 +0200, Steffen Grunewald wrote:
Hi,

On Fri, 2023-04-21 at 11:10:20 +0200, Thomas Hartmann wrote:
Hi Steffen,

I guess
   RESERVED_DISK = 131072
might be the culprit. I just checked the documentation and the value is in
MB, i.e., the reservation for non-condor stuff of ~131GB would surpass the
available 124G on your /var (unfortunately, unit prefixes/sizes are sometimes
a bit inconsistent)

Hm, my printed copy of the manual (10.0.0) must be wrong then. It has "(in kB)"
for both DISK and RESERVED_DISK - while the unit for RESERVED_SWAP is "MiB".

Also, matching is done by comparing TARGET.RequestDisk and DISK without any
unit conversions, so the JOB_DEFAULT_REQUESTDISK would be affected as well?

Previously I had a very low setting - which I'll restore now.


Another thing I noticed - is the execute directory on a dedicated volume?
Else
   STARTD_RECOMPUTE_DISK_FREE = false
might be a problem in cases where /var gets filled by other processes (like
logs) and the disk space available for jobs shrinks as well.

Since my partitionable slot gets only 75% of the total disk I'm not worried
about that, and there will be a watchdog checking for disk (partition)
shortages.

Thanks so far, I'll report about the outcome,

.... and here it is, from a different node though.
I have set
	RESERVED_DISK = 128
and the /var filesystem reports 129177320 kB free.
From "condor_status -l ... | grep Disk" I get
	TotalDisk = 129046416
	Disk = 96784812
- the latter being exactly 75% of the total space, as configured.
The difference between the free capacity and the TotalDisk value is 130904 kB,
which is close to 131072 = 128 * 1024 (but not identical), meaning that
RESERVED_DISK is indeed given in MB and multiplied by 1024 to get kB (the same
as RESERVED_SWAP), and the entry in my 10.0.0 manual (subsection 4.5.1, p.209)
is wrong - but has been fixed in the online version for 10.0.3. Lesson learned...
(BTW a negative value would have given me a hint to look closer...)
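
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. It only replays the numbers quoted in this message. The variable names are made up for illustration, and the 75% share is simply the slot configuration mentioned above. It does not model how condor_startd computes TotalDisk internally.

```python
# Figures quoted in the message above; sizes are in kB unless noted.
FREE_KB = 129_177_320        # free space reported for /var
TOTAL_DISK_KB = 129_046_416  # TotalDisk from condor_status -l
DISK_KB = 96_784_812         # Disk from condor_status -l
RESERVED_DISK = 128          # value set in the configuration

# Gap between what the filesystem reports and what the startd advertises.
observed_gap_kb = FREE_KB - TOTAL_DISK_KB

# Two readings of RESERVED_DISK: as MB (scaled by 1024) or as plain kB.
reserved_if_mb = RESERVED_DISK * 1024
reserved_if_kb = RESERVED_DISK

# Which reading is closer to the observed gap?
error_if_mb = abs(observed_gap_kb - reserved_if_mb)
error_if_kb = abs(observed_gap_kb - reserved_if_kb)

# The partitionable slot is configured to get 75% of TotalDisk.
expected_disk_kb = TOTAL_DISK_KB * 3 // 4

print("observed gap (kB):      ", observed_gap_kb)
print("reservation if MB (kB): ", reserved_if_mb, "-> off by", error_if_mb)
print("reservation if kB (kB): ", reserved_if_kb, "-> off by", error_if_kb)
print("75% of TotalDisk (kB):  ", expected_disk_kb, "matches Disk:",
      expected_disk_kb == DISK_KB)
```

The MB reading is off by only 168 kB, which could plausibly be the free space changing between the two measurements, while the kB reading is off by more than 130000 kB.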

I'm now trying to get a grip on =?=/=!= expressions and a means to extend
MOUNT_UNDER_SCRATCH (for the latter, "$(MOUNT_UNDER_SCRATCH),/something/else"
will produce unexpected results), but the major issue seems to be fixed.

Thanks for your suggestions!

- Steffen
