Re: [Condor-users] Condor job submission delayed
- Date: Thu, 02 Sep 2004 13:29:10 +0200
- From: Marc Saric <marc.saric@xxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Condor job submission delayed
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ralf Reinhardt wrote:
| What do you mean by very long time?
Around 30 min.
| 300-second delays can occur if the new job is submitted while Condor is
| within a 20-second frame of the negotiation cycle. You can start the
| job by using condor_reschedule.
| You can reduce the time by lowering the NEGOTIATOR_INTERVAL value, but
| the 20-second timeframe is fixed, so for a 60-second interval you have
| a 33% chance that your job must wait up to a minute.
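The arithmetic in the quoted paragraph can be sketched as follows. This is only an illustration, not Condor code: the function name is hypothetical, the 20-second window is taken from the quoted message, and submit times are assumed to be uniformly distributed over the negotiation interval.

```python
def chance_of_missing_cycle(interval_s: float, window_s: float = 20.0) -> float:
    """Fraction of submissions that fall inside the fixed window.

    Assumption: jobs are submitted at uniformly random times within one
    negotiation interval, so the chance is simply window / interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # The window cannot cover more than the whole interval.
    return min(window_s / interval_s, 1.0)


if __name__ == "__main__":
    # 300 s is the delay mentioned in the quoted message; 60 s is the
    # lowered NEGOTIATOR_INTERVAL from the same message.
    for interval in (300, 60):
        p = chance_of_missing_cycle(interval)
        print(f"interval {interval} s: {p:.0%} chance of waiting up to {interval} s")
```

Under these assumptions, lowering the interval shortens the worst-case wait but raises the share of jobs that hit it, which matches the 33% figure quoted for a 60-second interval. It does not explain waits of 10-30 minutes.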
That's clear to me. I did not expect the queued jobs to be executed
within a few seconds, but (recalling my first mail) it was unclear to me
why sometimes (not always) jobs don't get executed for 10-30 minutes
while "condor_status" lists a lot of machines (all of them Windows in my
case) as Unclaimed/Idle during that period.
"condor_q -analyze" just tells me that those machines match but reject
jobs for unknown reasons (is there any way to find out what those
reasons might be?).
The policy is the UWCS scheme as installed by the Windows installer
(pretty standard, i.e. Available/Unclaimed/Idle after 15 min without
keyboard/mouse input and with low CPU load). On one server (2CPU-WIN2k SP6) I
have configured Condor to use the TESTINGMODE settings, so at least that
machine should accept submissions immediately (but it does not).
Machines show up in condor_status, the queue can be queried with
condor_q, and the machines execute batch jobs OK. The only glitch is
that I can't submit from Windows machines due to a faulty
"condor_store_cred add" command (see my other mail from 2004-08-31).
Maybe you could point me to the right log file/debug setting to dig
deeper into these issues.
| If you have to wait for 30 minutes, it would point to a more serious
| problem in the negotiation between master and clients.
That seems to be the case, I think. But it is not consistent and I don't
know where to look for hints on what might be wrong.
If you want, I could send submit files and parts (or all) of the
log files in question for further analysis by people who actually
understand Condor. :-)
Thanks.
- --
Bye,
Marc Saric
Dr. Marc Saric, Bioinformatik, Proteom Centrum Tübingen,
Auf der Morgenstelle 15, D-72076 Tübingen, Germany,
Tel: +49 (0)7071 29 70557, marc.saric@xxxxxxxxxxxxxxxx
http://www.proteom-centrum-tuebingen.de
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFBNwQFBLD6PjSWyL4RAr/hAJ9a4+9TQWU6PGEZ/O8nwmSP/u+XogCgngft
OnSFdaKCSkPmgGJSn8wwKYI=
=GNsX
-----END PGP SIGNATURE-----