
Re: [Condor-users] Bulk submission of Jobs

On 7/3/07, Esh Esh <eshforcondor@xxxxxxxxx> wrote:

I am trying to submit 50000 jobs to a Condor system (Condor 6.8.2) in a loop
using web services. Every time I run this program I get an error after
submitting 32600 jobs.

The stack trace gives:

"java.net.ConnectException: Connection refused"

Has anybody faced this problem before?
Is this specific to Condor 6.8.2? Or is there a limit on the number of
jobs that can be submitted?

32600 sounds suspiciously like you are running out of file handles or
running out of sockets...

Are you holding open a file or socket on each submission?

If you pop a:

Runtime rt = Runtime.getRuntime();
rt.gc(); rt.runFinalization();

after each submission and it then works, then it is likely you are
failing to tidy up your handles. (You could isolate that by seeing how
long it normally takes to do the gc/finalize and just sleeping for
that time instead.)
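
If the gc test does point at leaked handles, the longer-term fix is to close
each handle explicitly rather than rely on finalizers. Below is a minimal
sketch of that pattern, not your actual client code: SubmissionHandle is a
hypothetical stand-in for whatever connection or stub your web services
client opens per submission, and the counter merely simulates OS handles.
It uses try-with-resources (Java 7 and later); on an older JVM the same
thing is done with try/finally.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SubmitLoopSketch {
    // Simulated count of currently open handles (stands in for OS file/socket handles).
    static final AtomicInteger openHandles = new AtomicInteger();

    // Hypothetical per-submission resource; a real client would wrap a socket or stub here.
    static class SubmissionHandle implements AutoCloseable {
        SubmissionHandle() {
            openHandles.incrementAndGet();
        }

        void submit(int jobId) {
            // A real client would send the job to the schedd here.
        }

        @Override
        public void close() {
            openHandles.decrementAndGet();
        }
    }

    // Submits the given number of jobs, closing each handle as soon as its
    // submission finishes. Returns the peak number of simultaneously open handles.
    static int submitAll(int jobs) {
        int peak = 0;
        for (int i = 0; i < jobs; i++) {
            try (SubmissionHandle handle = new SubmissionHandle()) {
                handle.submit(i);
                peak = Math.max(peak, openHandles.get());
            } // handle.close() runs here, even if submit() throws
        }
        return peak;
    }

    public static void main(String[] args) {
        int peak = submitAll(50000);
        System.out.println("peak open handles: " + peak);
        System.out.println("open after loop: " + openHandles.get());
    }
}
```

With explicit closing, the number of open handles stays at one no matter
how many jobs are submitted, instead of growing until the process hits a
limit.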

If it doesn't, then it gets more complex... there were previous
conversations here about keep-alive on the HTTP connection. That may
be a factor.