
Re: [Condor-users] FS authentication failure



On Thu, Jan 20, 2005 at 01:52:01PM -0600, Zachary Miller wrote:

> On Wed, Jan 19, 2005 at 09:58:50PM -0500, Leslie Groer wrote:
> 
> > I get these authentication failures in the StartLog for all worker
> > nodes in my Scientific Linux 3.0.3 Condor 6.7.2 setup.
> 
> hi leslie,
> 
> FS authentication can only be used to authenticate to a process that
> is running on the same machine as the client, like condor_submit
> authenticating to condor_schedd.
> 
> the reason is that it writes a file in /tmp, and therefore two
> machines, each with their own separate /tmp, cannot use this method.
> you probably just want to turn the authentication settings back to
> their default.
> 
> if you really desire host-to-host authentication, you'll want to
> use GSI or KERBEROS.  there is also an FS_REMOTE which can be used
> on a shared filesystem like NFS but i do not recommend this method
> since it usually has problems under even a small load.

The problem with NFS is purely a timing issue.  It's been a while
since I delved into this, but I seem to recall that the order is:

1.) server tells client name of file to create
2.) client creates file and notifies server
3.) server looks for file to test existence and ownership

The failure is at step three because the file creation may not be seen
by the server for several seconds due to NFS delays.
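
To make the three steps above concrete, here is a minimal Python
sketch of the exchange as I described it from memory.  It is not
Condor's actual code: the function names, the "FS_" file prefix and
the use of a plain file are all assumptions made for illustration.

```python
import os

def server_pick_filename(shared_dir):
    # Step 1: the server chooses a name the client must create.
    # The "FS_" prefix and random suffix are hypothetical.
    return os.path.join(shared_dir, "FS_" + os.urandom(8).hex())

def client_create_file(path):
    # Step 2: the client creates the file.  O_EXCL makes the call
    # fail if something with that name is already there.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)

def server_verify(path, claimed_uid):
    # Step 3: the server tests existence and ownership.
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        # Over NFS this answer may come from a stale cache, which
        # is the failure described above.
        return False
    return st.st_uid == claimed_uid
```

On a local filesystem server_verify() sees the file as soon as
client_create_file() returns.  Over NFS the lstat() in step 3 can
be answered from the server machine's cached view of the directory.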

The solution would be for the server to test for existence in a way
that ensures the directory is updated from the NFS server.  The normal
way is to create a dummy file and try to link it to the file that the
client was supposed to create.  The link should fail (because the file
already exists on the NFS server), but the side-effect is that the NFS
directory cache is forced to be updated.
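
A rough Python sketch of that link probe, under some assumptions:
the helper name exists_via_link_probe is made up, the dummy file is
placed in the same directory because hard links cannot cross
filesystems, and whether the attempt really refreshes the directory
cache depends on the NFS client in use.

```python
import os
import tempfile

def exists_via_link_probe(target_path):
    """Test for target_path by trying to hard-link a dummy file to it."""
    directory = os.path.dirname(target_path) or "."
    # Create the dummy file next to the target.
    fd, dummy_path = tempfile.mkstemp(prefix=".probe-", dir=directory)
    os.close(fd)
    try:
        try:
            os.link(dummy_path, target_path)
        except FileExistsError:
            # Expected case: the client's file is already there.
            return True
        # The link succeeded, so the target did not exist.  Remove
        # the name we just created and report it as absent.
        os.unlink(target_path)
        return False
    finally:
        os.unlink(dummy_path)
```

The point of the link attempt is that it has to be carried out by
the NFS server itself, so the answer reflects what is really in the
directory.  Run on a local filesystem, the function only shows the
control flow.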

What we have done here is export an NFS directory from our central
manager to all of the client machines.  When the authenticating
server and the NFS server are on the same machine, this problem goes
away: the file created by the client is already present in the
server's local filesystem, so no NFS caching is involved.

-- 
Daniel K. Forrest	Laboratory for Molecular and
forrest@xxxxxxxxxxxxx	Computational Genomics
(608) 262 - 9479	University of Wisconsin, Madison