[Condor-users] Job disconnected, attempting to reconnect
- Date: Sat, 5 May 2007 04:17:40 -0700 (PDT)
- From: simon kagwe <simonkagwe@xxxxxxxxx>
- Subject: [Condor-users] Job disconnected, attempting to reconnect
I have just installed Condor 6.8.4 on two Windows 2000 machines. I am submitting a simple Python script (fac.py, which calculates factorials) using the following submit description file:
# file : test.condor
# For testing submission of a python script on Condor
Executable = fac.py
Universe = vanilla
Output = out.$(cluster)
Error = err.$(cluster)
Log = log.$(cluster)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
I am getting the following messages in my log file:
000 (002.000.000) 05/05 13:47:42 Job submitted from host: <10.2.28.73:2798>
001 (002.000.000) 05/05 13:48:28 Job executing on host: <10.2.28.73:2799>
022 (002.000.000) 05/05 13:48:28 Job disconnected, attempting to reconnect
Socket between submit and execute hosts closed unexpectedly
Trying to reconnect to lab121machine6.icsdomain.uonbi.ac.ke <10.2.28.73:2799>
024 (002.000.000) 05/05 13:48:54 Job reconnection failed
Job not found at execution machine
Can not reconnect to lab121machine6.icsdomain.uonbi.ac.ke, rescheduling job
Please help me understand why the job would fail to reconnect when it's being executed on the same machine it was submitted from. I also have a Personal Condor installation on another machine that gives me similar log messages.
By the way, according to the Condor installation manual, after specifying that I am installing a new pool, I am supposed to be asked for the number of machines in the pool. That is not happening with any of my installations. Is there a problem with the MSI file I am using?
The fac.py looks like this:
if n == 0:
if n == 1:
Is the output of the 'print' statement going to be placed in the designated output file or do I have to place it in the file myself within the fac.py code? My assumption is that since python is installed on all the execute machines and it is added to the system path, fac.py will run as an executable. Is my assumption correct?
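To make the question concrete, here is a minimal sketch of a script of this kind. The function name fac and the sample input 5 are illustrative assumptions, not necessarily what my file contains. It writes its result only to standard output with print and never opens a file itself:

```python
# fac.py (illustrative sketch; the function name and sample input are assumptions)
def fac(n):
    """Return n! for a non-negative integer n."""
    if n == 0:
        return 1
    if n == 1:
        return 1
    return n * fac(n - 1)


if __name__ == "__main__":
    # The result goes to standard output only; the script opens no files.
    print(fac(5))
```

With a script like this, the only place the result could end up is wherever the job's standard output is sent.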
I know it's a lot of questions, but I really need your help. Thank you.