
Re: [Condor-users] pool drainoff



In theory that will shut down the machines once all the running jobs
leave. In practice I find that if one job takes an incredibly long time
to run, new jobs keep getting assigned to the machine and a peaceful
point to shut down is never reached. That's with 6.8.6 (yeah, Condor
guys, I know: why don't I tell you about these things? Sometimes it
just slips my mind... :) ).


accept any more new ones.  Also (2) an instruction to let existing
jobs on a schedd complete but not start any more new ones.  (yes
I know the latter could be accomplished with condor_hold -constraint ...)

(2) is precisely the feature I was trying to use, except on a startd,
not a schedd.  If it's not currently possible with 7.0.0, then I'll just
have to continue the tedious practice of watching for specific nodes to
become idle, then shutting condor off.  Otherwise I could just shut
condor off while jobs are running, but I don't like to kill jobs that
have been running for several hours.



condor_off -peaceful <nodename>
on a single node does work for me for this purpose (and it also worked
in 6.8.x and 6.9.x).
If it works right, the node's status should show up in condor_status
as "retiring" for all of the VMs, both busy and idle.  If it doesn't
work right, that means that the command failed quietly due
to lack of privilege to run it.  If your pool is GSI-authenticated,
then you need a grid proxy to run that command.
If it's normal host/IP authentication, you have to be
on the collector/negotiator to make the command work; it won't
work if you are logged into the startd node itself (and it won't
give you any errors saying that it didn't work).
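
Since the command can fail without any error, it may help to check the
result mechanically. The following is only a minimal sketch, not an
official Condor tool. It assumes you have captured one line per VM in
the form "<name> <state> <activity>", for example with something like
condor_status -format "%s " Name -format "%s " State -format "%s\n" Activity <nodename>
The function names and the sample host names are made up for
illustration.

```python
def slot_activities(status_text):
    """Parse lines of the form '<name> <state> <activity>' into a dict
    mapping each VM name to its activity."""
    slots = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, _state, activity = fields
        slots[name] = activity
    return slots


def not_retiring(status_text):
    """Return the sorted names of VMs whose activity is not 'Retiring'."""
    slots = slot_activities(status_text)
    return sorted(name for name, activity in slots.items()
                  if activity.lower() != "retiring")


def peaceful_took_effect(status_text):
    """True only if at least one VM was reported and all are 'Retiring'.

    Assumption (from the description above): after a successful
    condor_off -peaceful, every VM on the node shows as retiring.
    """
    slots = slot_activities(status_text)
    return bool(slots) and all(a.lower() == "retiring" for a in slots.values())


if __name__ == "__main__":
    # Hypothetical captured output for a two-VM node.
    sample = "vm1@node01 Claimed Retiring\nvm2@node01 Unclaimed Idle\n"
    print(peaceful_took_effect(sample))
    print(not_retiring(sample))
```

An empty capture is treated as "did not take effect", since no output
tells you nothing about the node. If any VM is listed by not_retiring,
the likely cause per the above is that the command was run without
sufficient privilege.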

Steve




--Mike

I thought I could do this by setting 'START=False' in the
node-specific condor_config.local, followed by
'condor_reconfig -subsystem startd' on the node, but that
doesn't seem to have worked.  The node is still starting new jobs.

Hmm...try:

	condor_reconfig -startd -full

But my gut feeling is that START = False is going to immediately
vacate the running jobs.

- Ian





_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
