
Re: [HTCondor-users] StartLog Error



Hi Todd, 

I meant execution machine :)

Thanks for the long explanation. However, c:\condor\execute does exist. Do you still think the error occurred because the service was still up?

Thank you,
Dennis.


On Thu, Jan 30, 2014 at 10:29 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
On 1/30/2014 1:59 PM, Dennis Zheleznyak wrote:
Hi everyone,

Today I ran into an issue where a machine didn't appear in my Condor
pool. All services were up, and I didn't see any errors in the logs
except the one below.

I'm running Condor 8.0.5 on Windows 7 Professional 64-bit; this machine is
a submit-only node.

StartLog:
01/30/14 18:48:30 slot1: New machine resource allocated
01/30/14 18:48:30 slot2: New machine resource allocated
01/30/14 18:48:30 slot3: New machine resource allocated
01/30/14 18:48:30 slot4: New machine resource allocated
01/30/14 18:48:30 ERROR "stat exec path (C:\condor\execute), errno: 2 (No
such file or directory)" at line 97 in file
c:\condor\execute\dir_29540\userdir\src\condor_startd.v6\util.cpp

Thank you,
Dennis.



My guess is the HTCondor service was still up because the condor_master.exe daemon was likely still running, and attempting to periodically restart the condor_startd.exe daemon.  The condor_master should have been sending email about the problem to you, assuming you configured the CONDOR_ADMIN and SMTP_SERVER settings in your condor_config file.
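
As a rough illustration of those two settings, a condor_config fragment might look like the sketch below. The address and host name are placeholders (assumptions, not values from this thread), so substitute your own.

```
# Hypothetical values for illustration only.
# Address that the condor_master sends problem reports to:
CONDOR_ADMIN = condor-admin@example.com
# Mail relay used for outgoing email on Windows:
SMTP_SERVER = smtp.example.com
```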

The problem is the condor_startd needs c:\condor\execute to exist.

But given that this machine is just a submit node, why run a condor_startd at all?  The condor_startd is the daemon that executes jobs, the condor_schedd is the one that submits jobs.  I suggest removing "STARTD" from the DAEMON_LIST entry in this machine's condor_config[.local] file.
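
If the machine really is submit-only, the change could look something like the fragment below. This is a sketch that assumes no other daemons are needed on this node; check your existing DAEMON_LIST before replacing it.

```
# Sketch for a submit-only node: keep the master and the schedd,
# and leave STARTD out so no condor_startd is started.
DAEMON_LIST = MASTER, SCHEDD
```

Restarting the HTCondor service afterwards is the safe way to make sure the new list takes effect.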

Note that if you do not run a condor_startd, then the machine will not appear when you do "condor_status", but that is just because by default condor_status will show information about startds (i.e. execute nodes).  You could see all your submit nodes by doing
  condor_status -schedd
and/or information about all users that have jobs submitted via
  condor_status -submitters

Hope the above helps,
Todd



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@cs.wisc.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/



--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685