
Re: [Condor-users] Negotiator restarted daily - error to fsync or flush



For HA, the condor_schedd daemon has to have a shared spool
area. The collector/negotiator doesn't.
One of the reasons I have not deployed an HA condor_schedd yet
is that I haven't found a good clustered shared file system
that I like for sharing the schedd spool area.

Steve


On Fri, 29 Jan 2010, Johnson koil Raj wrote:

Hi Steve,

  Thanks.

If we keep the spool directory local and use HA for the schedd daemon, how will the clusterX.procX.subproc0 directories created for each job be kept in sync?

by
Johnson


Steven Timm wrote:
You do need the Accountantnew.log whether you are using
replication or not. It is how the negotiator keeps track of
fair share priorities and calculates the priorities you
see when you do condor_userprio.  It also saves some historical
accounting information.

I've tried putting many parts of condor into NFS but never
the spool directory.  I am not sure if that is supposed to work or not.
Steve


On Thu, 28 Jan 2010, Johnson koil Raj wrote:

Hi All,

In our pool, the negotiator is configured for HA. I am seeing this error daily, or at least frequently: the negotiator dies and restarts.
The spool directory is on NFS.

I am not using condor_replication, so is this file needed?
If I don't use replication, am I losing something?

1/27 04:02:18 ERROR "fsync of /workingcopy/spool/Accountantnew.log failed, errno = 116" at line 206 in file classad_log.cpp
1/28 04:07:10 ERROR "flush to /workingcopy/spool/Accountantnew.log failed, errno = 116" at line 202 in file classad_log.cpp

Can anyone tell me why this kind of error occurs? Have I missed any configuration?
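
A note on reading the errno in the log lines above: on Linux, errno 116 is ESTALE (stale file handle), which usually points at the NFS mount rather than at the Condor configuration. The sketch below is only an illustration, assuming a Linux host and the exact log format quoted in this message; `parse_errors` and `describe` are hypothetical helper names, not part of Condor.

```python
import errno
import os
import re

# Pattern for the ERROR lines quoted in this thread (format assumed from
# those two lines only, not from any Condor documentation).
LOG_RE = re.compile(
    r'(?P<when>\d+/\d+ \d+:\d+:\d+) ERROR "(?P<op>\w+) (?:of|to) (?P<path>\S+) failed, '
    r'errno = (?P<errno>\d+)"'
)

def parse_errors(text):
    """Return (timestamp, operation, path, errno) for every ERROR entry in text."""
    return [
        (m["when"], m["op"], m["path"], int(m["errno"]))
        for m in LOG_RE.finditer(text)
    ]

def describe(code):
    """Render an errno as 'number NAME: message' using the local platform's tables."""
    name = errno.errorcode.get(code, "unknown")
    return f"{code} {name}: {os.strerror(code)}"

if __name__ == "__main__":
    line = ('1/27 04:02:18 ERROR "fsync of /workingcopy/spool/Accountantnew.log '
            'failed, errno = 116" at line 206 in file classad_log.cpp')
    for when, op, path, code in parse_errors(line):
        print(when, op, path, describe(code))
```

Errno numbers are platform specific, so `describe` should be run on the same kind of host that wrote the log; the symbolic name and message it prints come from that host's C library.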

by
Johnson

Please do not print this email unless it is absolutely necessary. The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. www.wipro.com
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/





--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.