
[Condor-users] Job disconnected, attempting to reconnect



Hi everyone,
I have just installed Condor 6.8.4 on 2 Windows 2000 machines. I am submitting a simple Python script (fac.py, which calculates factorials) using the following submit description file:
 
# file : test.condor
# For testing submission of a python script on Condor

Executable = fac.py
Universe        = vanilla
Output  = out.$(cluster)
Error  = err.$(cluster)
Log  = log.$(cluster)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Queue
 
I am getting the following messages in my log file:
000 (002.000.000) 05/05 13:47:42 Job submitted from host: <10.2.28.73:2798>
...
001 (002.000.000) 05/05 13:48:28 Job executing on host: <10.2.28.73:2799>
...
022 (002.000.000) 05/05 13:48:28 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to lab121machine6.icsdomain.uonbi.ac.ke <10.2.28.73:2799>
...
024 (002.000.000) 05/05 13:48:54 Job reconnection failed
    Job not found at execution machine
    Can not reconnect to lab121machine6.icsdomain.uonbi.ac.ke, rescheduling job
 
Please help me understand why the job would fail to reconnect when it's being executed on the same machine it was submitted from. I also have a Personal Condor installation on another machine that gives me similar log messages.
 
By the way, according to the Condor installation manual, after specifying that I am installing a new pool, I am supposed to be asked for the number of machines in the pool. That is not happening with any of my installations. Is there a problem with the MSI file I am using?
 
The fac.py looks like this:
def fac(n):
 if n == 0:
  return 1
 if n == 1:
  return 1
 else:
  return fac(n-1)*n
print fac(400)
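
For reference, here is a minimal sketch of the same calculation with conventional four-space indentation and a single base case. It is an illustrative rewrite, not the file that was actually submitted above. The function name fac is kept from the original; the parenthesized print call is an assumption made so that the script runs under both Python 2 and Python 3.

```python
# Sketch only: same recursive factorial as fac.py above, tidied up.
# Assumption: print is written with parentheses so that the file
# runs unchanged under Python 2 and Python 3.

def fac(n):
    """Return n! for a non-negative integer n."""
    # 0! and 1! are both 1, so one base case covers them.
    if n <= 1:
        return 1
    return fac(n - 1) * n


if __name__ == "__main__":
    # Written to standard output, exactly like the original print statement.
    print(fac(400))
```

The recursion depth for fac(400) is about 400 calls, which stays under Python's default recursion limit.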
 
Is the output of the 'print' statement going to be placed in the designated output file, or do I have to write it to the file myself within the fac.py code? My assumption is that since Python is installed on all the execute machines and has been added to the system path, fac.py will run as an executable. Is my assumption correct?
 
I know it's a lot of questions, but I really need your help. Thank you.
 

