Turn up your schedd debug level and the SchedLog will record
when, if ever, the SYSTEM_PERIODIC_HOLD expression is applied.
I've seen things get messed up when a schedd gets very busy, but
that was a long time ago.
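
A minimal sketch of what raising the debug level might look like, assuming the standard SCHEDD_DEBUG knob and a local config file you can edit. D_FULLDEBUG is only one possible choice of level, and it makes the log grow quickly on a busy schedd:

```
# Local HTCondor config on the submit host (illustrative, not a tuned setting)
SCHEDD_DEBUG = D_FULLDEBUG

# After editing, run condor_reconfig so the schedd picks up the change.
# The log location can be found with: condor_config_val SCHEDD_LOG
```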

Steve Timm

IceCube's OSG submitters have a problem where their SYSTEM_PERIODIC_HOLD expression is applied extremely rarely (at intervals of days or more) and only to a subset of matching jobs. It may be that matching jobs are actually placed on hold only right after condor is restarted (and even then not always).

Running 'condor_q -con "$(condor_config_val system_periodic_hold)"' displays the right jobs, so I don't think I have a typo in the expression. I played with various PERIODIC_EXPR timing settings but couldn't fix the problem.
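
For reference, the PERIODIC_EXPR timing knobs are presumably the three below. This is only a sketch: the values are illustrative placeholders, not the ones that were tried and not recommendations.

```
# Schedd-side timing of periodic expression evaluation (illustrative values)
PERIODIC_EXPR_INTERVAL = 60         # minimum seconds between evaluations
MAX_PERIODIC_EXPR_INTERVAL = 1200   # upper bound on the interval, in seconds
PERIODIC_EXPR_TIMESLICE = 0.01      # max fraction of schedd time spent evaluating
```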

The only unusual thing about the affected servers I can think of is that they are OSG submitters, so all their jobs flock to other pools, and most run on glideins.

Does anybody know what could be going wrong or how to debug this?


