
Re: [Condor-users] "Job disconnected" error



Hi,

I had the same output and have resolved the problem. The cause was that
I had submitted a .exe file as the executable. I fixed it by making a
.bat file the executable and calling the .exe from inside it.
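
In case it helps, here is a rough sketch of that workaround, written as a
small Python helper that generates the two files. The names myprog.exe,
run_myprog.bat and myprog.sub are made up for illustration, and the
submit settings are my assumptions rather than a copy of my real job, so
check them against the manual for your Condor version.

```python
from pathlib import Path
from typing import Dict


def wrapper_job_files(exe_name: str = "myprog.exe") -> Dict[str, str]:
    """Return {file name: contents} for a .bat wrapper and a submit file.

    exe_name is a hypothetical program name; replace it with your own.
    """
    stem = Path(exe_name).stem
    bat_name = f"run_{stem}.bat"
    sub_name = f"{stem}.sub"

    # The wrapper calls the real program, forwards any arguments (%*),
    # and hands the program's exit code back to Condor.
    bat_text = "\n".join([
        "@echo off",
        f"{exe_name} %*",
        "exit /b %ERRORLEVEL%",
    ]) + "\n"

    # The .bat is the job's executable; the .exe goes along as an input
    # file so the wrapper can find it in the job's scratch directory.
    # These settings are assumed, not taken from the original job.
    sub_text = "\n".join([
        "universe = vanilla",
        f"executable = {bat_name}",
        f"transfer_input_files = {exe_name}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"output = {stem}.out",
        f"error = {stem}.err",
        f"log = {stem}.log",
        "queue",
    ]) + "\n"

    return {bat_name: bat_text, sub_name: sub_text}


def write_files(files: Dict[str, str], job_dir: str) -> None:
    """Write the generated files into job_dir with Windows line endings."""
    target = Path(job_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, text in files.items():
        # newline="\r\n" makes Python translate each "\n" to CRLF on write.
        with open(target / name, "w", newline="\r\n") as handle:
            handle.write(text)
```

Calling write_files(wrapper_job_files("myprog.exe"), "job_dir") and then
running condor_submit myprog.sub from that directory (with myprog.exe
copied there too) should reproduce the setup I described.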

>>> simon kagwe <simonkagwe@xxxxxxxxx> 11/05/2007 11:00 >>>
hi,

I am submitting a job from the central manager of a pool of 5 machines
[Windows 2000 machines with Condor 6.8.4, Universe = vanilla]. I keep
getting the "job disconnected" error even when the job is being executed
on the same machine where it has been submitted. How can that happen?
How can 108 bytes be received by the job, yet it is disconnected? Will
someone please help me understand what is going on? I asked a similar
question earlier but the answer I got didn't solve the problem. Neither
did answers from the archives.

(NB: For some reason, my condor_config.local files on all machines are
empty by default after installation. Are they supposed to be like that?)

The log file contains:

000 (005.000.000) 05/11 11:39:14 Job submitted from host: <10.2.28.50:1055>
...
001 (005.000.000) 05/11 12:29:54 Job executing on host: <10.2.28.50:1056>
...
007 (005.000.000) 05/11 12:29:59 Shadow exception!
	Error from starter on lab121machine7.icsdomain.uonbi.ac.ke: Create_Process(C:\condor\execute\dir_3308\condor_exec.exe,, ...) failed
	0  -  Run Bytes Sent By Job
	108  -  Run Bytes Received By Job
...
001 (005.000.000) 05/11 12:30:26 Job executing on host: <10.2.28.50:1056>
...
022 (005.000.000) 05/11 12:30:27 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to lab121machine7.icsdomain.uonbi.ac.ke <10.2.28.50:1056>
...
024 (005.000.000) 05/11 12:30:36 Job reconnection failed
    Job not found at execution machine
    Can not reconnect to lab121machine7.icsdomain.uonbi.ac.ke, rescheduling job
...

