
Re: [Condor-users] Condor in a secure academic environment



We set up our Condor pool in an academic environment...so here is our
experience....

> How secure is the data and access on these machines (especially
> confidential & enterprise data) by both the daemons and users?  

We limited access to the compute hosts somewhat by having them trust only
one submit host, which (selected) users have to log in to first.

Then, as far as I can see, the data on the compute hosts is as secure as the
owner of the machine wants it to be. We assume that machine owners will set
sensible permissions on their sensitive data, or at least permissions
restrictive enough to prevent the anonymous user under which the jobs
actually run from snooping around and retrieving files.

To effectively limit access to the data stored on the compute node, I think
that using a virtual machine could be a solution, but then you may get a
performance hit...and in my opinion a rather complicated architecture to deal
with...

On Linux it may be possible to run the Condor daemons and jobs in a chrooted
environment to create some kind of sandbox, but I've never seen anyone
mention it....

In our case, most of the compute nodes are classroom computers, where there's
no sensitive data, so the point is moot; but we also offer to install Condor
on staff workstations, where admittedly this issue is very relevant. People
installing Condor on their workstations are warned that their computer
essentially becomes a multi-user computer, where data must be protected
accordingly.

> Would a virtual machine be required to run Linux, or would it be usable
> if jobs were ported to Windows?

Both options are possible. In the latest Condor versions there's some support
for virtual machine environments, where (if I'm not mistaken) it is possible
to start VMs on demand with a small Linux distribution that also contains the
job. All computations will then be performed by the VM.
I would suggest having a look at one of the presentations from CondorWeek
2008, where the VM approach is described...

In our environment we offer both Windows and Linux (and MacOS X) compute hosts,
and a few cross-compilers on the submit host, so that our users can compile
their applications for all targets.

> Availability policy:
> 
> I'm aware of suspend & preempt if node not idle; is there further policy
> options, such as:
> - only run after business hours

We have implemented a policy where jobs are started only between 8pm and 7am,
plus the whole weekend (during these times, almost nobody uses the computers).
This is easily done by modifying the START parameter in condor_config. Here
are the relevant settings we use to implement this policy:

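# Work time: Monday to Friday (ClockDay 1-5), between 07:00 and 20:00.
# ClockMin counts minutes since midnight: 420 = 7am, 1200 = 8pm.
IsWorkTime = ( (ClockDay > 0 && ClockDay < 6) && \
               (ClockMin > 420 && ClockMin < 1200) )
IsGridTime = $(IsWorkTime) =!= True
# StartIdleTime and CPUIdle are assumed to be defined as in the stock
# condor_config. Nothing may follow a trailing backslash on a line.
START      = ( (KeyboardIdle > $(StartIdleTime)) \
               && ( $(CPUIdle) || \
                    (State != "Unclaimed" && State != "Owner")) \
               && $(IsGridTime) == True )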

> - only run when owner of machine is logged off 

No idea on this one...but the "KeyboardIdle" attribute, which can be used in
condor_config expressions, may come in handy (though I guess it is no
guarantee that the owner is completely logged off)

> - network bandwidth throttling (for buildings with slow network
> connections)

I'm not aware of explicit methods to perform network bandwidth throttling
from within Condor. A workaround may be to advertise a custom ClassAd
attribute for the machines in the slow buildings, and then submit to these
hosts only jobs whose submitters explicitly say in the requirements that
their jobs are not network hungry. But I'm sure that there's some other way to
do it...
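
As a rough, untested sketch of that workaround: the attribute names
SlowNetwork and NetworkLight below are made up for illustration, and this
variant has the machine's START expression check a custom job attribute
instead of relying on the job's requirements alone. It also assumes that
START is already defined earlier in the configuration.

```
# Machine side: local config of the hosts in the slow buildings.
# Advertise a custom attribute (hypothetical name) in the machine ClassAd.
SlowNetwork  = True
STARTD_ATTRS = $(STARTD_ATTRS), SlowNetwork

# Only start jobs that declare themselves light on network traffic.
# =?= evaluates to False (not UNDEFINED) when the job lacks the attribute.
START        = ($(START)) && (TARGET.NetworkLight =?= True)

# Job side: a line like this would go in the submit description file
# of a job that is not network hungry (hypothetical attribute name):
#   +NetworkLight = True
```

Jobs that do not set the attribute would simply never match the slow
machines, while still being free to run everywhere else.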

Pascal