
Re: [HTCondor-users] HAS_CVMFS_something ?



Brian,

On Mon, 2023-04-24 at 12:33:50 +0000, Bockelman, Brian wrote:
> 
> https://github.com/opensciencegrid/osg-flock/blob/master/ospool-pilot/main/pilot/advertise-base#L380-L516
> 
> Feel free to borrow as needed.

It took me the better part of a working day, mostly spent thinking about how and where
to use this, before I recognized what the first two dozen lines or so are doing.
You see, I'm new to the STARTD_CRON machinery (although some of it is already running
in my config for GPU monitoring)...
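
In case it helps anyone similarly new to this, a minimal STARTD_CRON stanza for such an
advertise script looks roughly like the following (job name, script path and period are
placeholders of my own choosing, not taken from the OSG pilot):

    # run the advertise script periodically and merge its output into the machine ad
    STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) CVMFS
    STARTD_CRON_CVMFS_EXECUTABLE = /usr/local/libexec/condor/advertise-cvmfs
    STARTD_CRON_CVMFS_PERIOD = 10m
    STARTD_CRON_CVMFS_MODE = Periodic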

I eventually found https://htcondor.readthedocs.io/en/v10_0/admin-manual/hooks.html#startd-cron-and-schedd-cron-daemon-classad-hooks
but that page claims the hook should output a magic string when done - I could not
find that in the script?
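
For what it's worth, the attribute part of the hook output is straightforward: the script
just prints ClassAd "attribute = value" pairs on stdout, roughly like this stripped-down
sketch (my own simplification, not the actual OSG script):

    #!/bin/bash
    # minimal advertise hook sketch: emit one ClassAd attribute per line on stdout
    if [ -d /cvmfs/singularity.opensciencegrid.org ]; then
        echo 'HAS_CVMFS_singularity_opensciencegrid_org = True'
    else
        echo 'HAS_CVMFS_singularity_opensciencegrid_org = False'
    fi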

Nevertheless, there are a lot of HAS_CVMFS_* ad entries reported by condor_status -l
now, so I assume I got it working.
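
For completeness, a quick way to double-check is something along these lines (attribute
name taken from the job's Requirements quoted below):

    condor_status -constraint 'HAS_CVMFS_singularity_opensciencegrid_org =?= true' -af Machine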

A minor question remains: which UID runs this, in which working directory, and
what happens to the stderr output?


Thanks, Steffen

> > On Apr 24, 2023, at 5:17 AM, Steffen Grunewald <steffen.grunewald@xxxxxxxxxx> wrote:
> > 
> > Good morning/afternoon/...,
> > 
> > today, after a rather lengthy upgrade of our HTCondor pool, I found some jobs
> > in "idle" state, despite plenty of resources still available.
> > 
> > Investigation (condor_q -better-analyze) shows, for one of the affected jobs:
> > 
> > 
> > The Requirements expression for job 27474.000 is
> > 
> >    ((HAS_CVMFS_singularity_opensciencegrid_org is true)) && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) && (TARGET.Cpus >= RequestCpus) && (TARGET.HasFileTransfer && versioncmp(split(TARGET.CondorVersion)[1],"8.9.7") >= 0) &&
> >    TARGET.HasSelfCheckpointTransfers
> > 
> > Job 27474.000 defines the following attributes:
> > 
> >    RequestCpus = 16
> >    RequestDisk = 10485760
> >    RequestMemory = 40000
> > 
> > The Requirements expression for job 27474.000 reduces to these conditions:
> > 
> >         Slots
> > Step    Matched  Condition
> > -----  --------  ---------
> > [0]           0  HAS_CVMFS_singularity_opensciencegrid_org is true
> > [11]        585  TARGET.HasFileTransfer
> > [12]        585  versioncmp(split(TARGET.CondorVersion)[1],"8.9.7") >= 0
> > 
> > 
> > Older jobs from the same user (and a similar DAG) didn't require that HAS_CVMFS_...
> > attribute to be set.
> > 
> > While all machines have CVMFS access (`ls /cvmfs/singularity.opensciencegrid.org`
> > works on all pool nodes), this requirement, which I have never seen before, needs to
> > be fulfilled - but I have no idea whether that means setting HAS_CVMFS_... via a
> > configuration file (any suggestion is welcome) or whether HTCondor could take
> > care of it itself (perhaps triggered by another setting in the config).
> > 
> > I've searched the latest documentation and all of my notes, to no avail.
> > 
> > Is there somebody who can cure my (obvious?) blindness?
> > 
> > Thanks,
> > Steffen
> > 
> > -- 
> > Steffen Grunewald, Cluster Administrator
> > Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
> > Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
> > ~~~
> > Fon: +49-331-567 7274
> > Mail: steffen.grunewald(at)aei.mpg.de
> > ~~~

-- 
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~