
[Condor-users] Can you call condor_reschedule too frequently?



We have a scripted interface to our Condor system that lets users
generate hundreds of jobs with minimal fuss. I'm using the Condor.pm
Perl module. We've noticed a serious lag between the time jobs are
submitted from Windows machines and the time they are picked up by the
Linux-based master (we're running 6.7.2) and assigned to Windows
executors in the system. For up to 10 minutes after jobs have been
submitted, I've observed condor_q -analyze reporting that "X
machines match the job but reject it for unknown reasons". The
situation can be rectified immediately by running condor_reschedule. But
it irks me that this happens even in a completely empty system.

We have:
	NEGOTIATOR_CYCLE=120
	SCHEDD_INTERVAL=120

We assumed these settings would keep the lag to a minimum. We only see
the lag when we submit from Windows.

As a stop-gap measure I'm contemplating calling condor_reschedule from
the user interface script after submitting the jobs and before entering
the Condor monitor loop. Should I worry about stress on the master? Can
anyone comment on why jobs take so long to match when submitted from
Windows?
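
To make the stop-gap concrete, here is a rough sketch of what I have in
mind, written in Python rather than in our Perl script. It wraps
condor_reschedule in a small throttle so that a burst of submissions
triggers at most one reschedule request per interval. The class name
ThrottledReschedule and the 30-second minimum interval are hypothetical
choices for illustration only, not anything Condor prescribes.

```python
import subprocess
import time


class ThrottledReschedule:
    """Run condor_reschedule at most once per min_interval seconds.

    Hypothetical helper: the name and the default interval are
    illustrative assumptions, not part of Condor.
    """

    def __init__(self, min_interval=30.0, run=None, clock=time.monotonic):
        self.min_interval = min_interval
        # By default, invoke the real command-line tool with no arguments.
        self.run = run or (
            lambda: subprocess.run(["condor_reschedule"], check=True)
        )
        self.clock = clock
        self._last = None  # time of the last reschedule actually issued

    def request(self):
        """Issue a reschedule unless one was issued too recently.

        Returns True if the command was run, False if it was skipped.
        """
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False
        self.run()
        self._last = now
        return True
```

The interface script would call request() once after each batch of
submissions; calls that arrive inside the interval are simply skipped,
so the master never sees more than one request per interval from this
script.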

Thanks.

- Ian


--
Ian R. Chesal <ichesal@xxxxxxxxxx>
Senior Software Engineer

Altera Corporation
Toronto Technology Center
Tel: (416) 926-8300