
Re: [Condor-users] Max Number of Jobs Submission

Thanks, Matt.
To keep jobs from running on the central manager, would it be enough
to have

DAEMON_LIST                     = MASTER, SCHEDD

in the condor_config file on the central manager?
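
For reference, a minimal sketch of how this is commonly arranged. It
assumes a stock Condor setup in which the condor_startd is the daemon
that runs jobs, and the central manager role is provided by the
condor_collector and condor_negotiator. Under that assumption, a
central manager that accepts submissions but runs no jobs keeps
COLLECTOR and NEGOTIATOR in the list and leaves out STARTD; listing
only MASTER and SCHEDD would also stop the machine from acting as
central manager. The file name in the comment is illustrative, so check
this against your existing configuration before applying it.

```
# Local config on the central manager (illustrative; adapt to your site).
# COLLECTOR and NEGOTIATOR provide the central manager role.
# SCHEDD accepts job submissions.
# STARTD is left out, so this machine never offers itself to run jobs.
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
```

The condor_master has to reread its configuration (or be restarted)
before a change to DAEMON_LIST takes effect.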

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matt Hope
Sent: Thursday, March 22, 2007 3:11 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Max Number of Jobs Submission

On 3/22/07, Natarajan, Senthil <senthil@xxxxxxxx> wrote:
> Hi,
> Thanks for the info.
> Here the central manager is the only dedicated submit node, and it
> also runs jobs.

This is a bad idea on many levels including security.

I strongly suggest never allowing jobs to run on the submit machine. A
runaway job could take out the whole farm in quite a few ways: resource
starvation from file handles, disk space, memory, and the list goes on.
While you can plug each one, you don't want to have to.

Secondly, jobs running on that machine would have a variety of ways of
exploiting the increased permissions of the box (again controllable,
but easy to miss).

I also suggest (slightly less strongly, but still pretty strongly) that
if most of your jobs could last several hours or more, you consider
either splitting your submissions across multiple machines or going for
a High Availability solution. If you can handle the increased
complexity, the latter would provide very solid stability.
An error on this machine or a forced reboot could wipe out all the CPU
cycles expended across 150 nodes.
Consider this cost and the risk of failure/downtime carefully...
