
RE: [Condor-users] Upper bound on Job Run-time



Erik,

That's what was happening.  Thanks very much.

-David

-----Original Message-----
From: Erik Paulson [mailto:epaulson@xxxxxxxxxxx]
Sent: Thursday, October 28, 2004 6:46 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Upper bound on Job Run-time


On Fri, Oct 22, 2004 at 09:12:04AM -0400, David Vestal wrote:
> My Condor grid is made mostly of WinXP machines running 6.6.5.  Playing around with condor_userlog, I noticed that there seems to be a 2-hour upper bound on the running time of my jobs.  I didn't configure this, and I don't want it.  Is this controlled by a Condor config file entry?
> 

There is no upper-bound on runtime. 

What does default to 2 hours is the TCP keepalive timer on most operating
systems. It's possible one side (probably the execute side) is crashing or
somehow becoming disconnected, and Condor is restarting the job. What does
the actual userlog entry say for job 446, and what do the starter logs from
10.81.1.206 and friends say?
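
For anyone unfamiliar with the keepalive timer mentioned above, here is a
minimal Python sketch of what it looks like at the socket level. It does not
show anything Condor does internally; the helper names enable_keepalive and
keepalive_idle are made up for illustration, and the 7200-second value is
simply the 2-hour default expressed in seconds. TCP_KEEPIDLE is
platform-specific, so the sketch only touches it where the socket module
exposes it.

```python
import socket


def enable_keepalive(sock, idle_seconds=7200):
    """Turn on TCP keepalive for sock (hypothetical helper, for illustration).

    idle_seconds is how long the connection may sit idle before the first
    keepalive probe is sent; 7200 s is the common 2-hour default.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket idle time is not available on every platform,
    # so set it only where the constant exists.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    # Report whether keepalive is now enabled on this socket.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


def keepalive_idle(sock):
    """Return the per-socket idle time in seconds, or None if unsupported."""
    if hasattr(socket, "TCP_KEEPIDLE"):
        return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE)
    return None


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("keepalive enabled:", enable_keepalive(s))
    print("idle seconds before first probe:", keepalive_idle(s))
    s.close()
```

If a peer vanishes without closing the connection, the other side may only
notice once the keepalive probes go unanswered, which is why a disconnect
can surface at roughly the 2-hour mark.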

-Erik

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
http://lists.cs.wisc.edu/mailman/listinfo/condor-users