
Re: [HTCondor-users] peaceful node drain and shutdown



To pick up a discussion that left off in mid-July...

A cryptic error crops up in condor_q -better-analyze when the START 
expression is set to UNDEFINED or ERROR:

condor1$ condor_q -better-analyze 264


-- Schedd: condor1 : <138.127.79.182:1567?...
error: bad form
error: problem with ExprToProfile
User priority for pelletm@cade is not available, attempting to analyze 
without it.
---

This appears to arise from ExprToMultiProfile() in boolExpr.cpp, which is 
complaining that the Start value, as a Boolean type, is neither True nor 
False.

So it would appear that the proper approach to refusing additional jobs 
while keeping existing jobs running, while also avoiding having condor_q 
complain to the user about something outside of their purview, is to sever 
START and IS_OWNER as suggested, via START = False along with IS_OWNER = 
False, rather than setting START to UNDEFINED or ERROR.
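
As a minimal sketch, the settings described above might look like this in 
the execute node's local configuration. Which file they belong in depends 
on the site, and the comments are illustrative assumptions rather than 
anything confirmed in this thread:

```
# Refuse new jobs on this node. START only gates new matches, so jobs
# that are already running are expected to continue.
START = False

# Decouple IS_OWNER from START so the slot is not reported in the
# Owner state.
IS_OWNER = False
```

Running condor_reconfig on the node afterwards should make the startd 
pick up the change without a restart.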

But in thinking about this further, it seems like this is roughly 
equivalent to draining a machine with an infinite retirement time. I 
wonder whether there is any way to initiate draining through ClassAd 
expressions, along the lines of "START_BACKFILL = True". That is, 
something like START_DRAINING = True?

        -Michael Pelletier.