
[Condor-users] Job disconnected, attempting to reconnect



Hi everyone,
I have just installed Condor 6.8.4 on two Windows 2000
machines. I am submitting a simple python script
(fac.py that calculates factorials) using the
following submit description file:
 
# file : test.condor
# For testing submission of a python script on Condor

Executable              = fac.py
Universe                = vanilla
Output                  = out.$(cluster)
Error                   = err.$(cluster)
Log                     = log.$(cluster)
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
Queue
 
I am getting the following messages in my log file:
000 (002.000.000) 05/05 13:47:42 Job submitted from host: <10.2.28.73:2798>
...
001 (002.000.000) 05/05 13:48:28 Job executing on host: <10.2.28.73:2799>
...
022 (002.000.000) 05/05 13:48:28 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to lab121machine6.icsdomain.uonbi.ac.ke <10.2.28.73:2799>
...
024 (002.000.000) 05/05 13:48:54 Job reconnection failed
    Job not found at execution machine
    Can not reconnect to lab121machine6.icsdomain.uonbi.ac.ke, rescheduling job
 
Please help me understand why the job would fail to
reconnect when it's being executed on the same machine
it was submitted from. I also have a Personal Condor
installation on another machine that gives me similar
log messages. 
 
By the way, according to the Condor installation
manual, after specifying that I am installing a new
pool, I am supposed to be asked the number of machines
in the pool. That is not happening with any of my
installations. Is there a problem with the MSI file I
am using?
 
The fac.py looks like this:
def fac(n):
    if n == 0:
        return 1
    if n == 1:
        return 1
    else:
        return fac(n-1)*n
print(fac(400))
 
Is the output of the 'print' statement going to be
placed in the designated output file or do I have to
place it in the file myself within the fac.py code? My
assumption is that since python is installed on all
the execute machines and it is added to the system
path, fac.py will run as an executable. Is my
assumption correct?
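
To make the assumption about 'print' concrete, here is a minimal
sketch (written for Python 3, and not part of the fac.py I submitted)
that captures standard output in memory. It only shows that print
writes to standard output; whether Condor redirects that stream into
the file named by Output is the part I am asking about. The helper
name run_and_capture is made up for illustration.

```python
import io
from contextlib import redirect_stdout


def fac(n):
    # Same recursion as fac.py: 0! and 1! are both 1.
    if n == 0 or n == 1:
        return 1
    return fac(n - 1) * n


def run_and_capture():
    # Hypothetical helper: collect whatever print() sends to standard
    # output, standing in for a file attached to the job's stdout.
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        print(fac(400))
    return buffer.getvalue()


captured = run_and_capture()
```

If that understanding holds, the text in 'captured' (the digits of
400! followed by a newline) is what I would expect to find in
out.$(cluster) once the job finishes.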
 
I know it's a lot of questions but I really need your
help. Thank you.


 