Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] Strange scheduling behavior in 6.8.0

Date: Wed, 16 Aug 2006 10:40:45 -0700
From: "Michael S. Root" <mike@xxxxxxxxxxxxxx>
Subject: [Condor-users] Strange scheduling behavior in 6.8.0

Hi all. I'm having an intermittent problem since upgrading to 6.8.0 from 6.6.10 a few weeks ago. Here's the scenario:

We have a pool of about 40 machines running Linux (some FC4, some are still RH8), all running 6.8.0.

I submit a DAG with about 25 jobs. There are no inter-job dependencies, all the machines match the job criteria, and there are no other jobs running in the pool.

Most of the time, all the jobs will be appropriately scheduled and run simultaneously. However, sometimes, only about 10 of the jobs will get started (the exact number varies). DAGman has submitted them all into the queue, but they aren't matched for some reason. As the first batch of jobs finish, more are submitted, but never more than the initial count run at once.

When this behavior is occurring, if I run "condor_status", it properly lists all the machines in our pool, including the idle ones that should have been matched to jobs. If I run "condor_reschedule -all", it will send the "Reschedule" command to only those 10 or so machines that are actually running jobs. If I run "condor_restart -all", it will send the "Restart" command to all machines in the pool, at which point everything will return to normal: all the 'stuck' jobs get properly matched to machines.


Anyone else see something like this?

-Mike

Follow-Ups:
- Re: [Condor-users] Strange scheduling behavior in 6.8.0
  - From: Erik Paulson

