
[Condor-users] how to troubleshoot the scheduling process



Hi Folks,

We have a situation where a user with a very high EUP (effective user priority) and a large number of jobs in the queue is always scheduled ahead of users with a much lower EUP (by a factor of 100 or more), and thus much higher priority. All of these jobs are parallel (MPI) jobs, which is likely relevant.

To begin, can anyone suggest a method to diagnose the problem and to see how these evaluations are taking place? My understanding from the manual is that users' jobs are considered in order of priority (from lowest EUP to highest), but the opposite seems to be occurring.

As an example, this user, who is using 156 of 200 resources, has a 12-process parallel job complete. His EUP is 156. Immediately, a new 12-process job of his is started, despite the fact that there is a user with an EUP of 0.5 and an 8-node job waiting in the queue.
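
To make the expectation concrete, here is a minimal sketch of the ordering I would expect for the example above. It assumes submitters are simply considered in ascending EUP order; the function and user names are hypothetical, and this is not Condor's actual negotiator logic.

```python
# Hypothetical illustration only: this is not Condor's negotiator code.
# Assumption: submitters are considered in ascending EUP order
# (a lower EUP means a better priority).

def negotiation_order(eup_by_user):
    """Return user names sorted from best (lowest EUP) to worst priority."""
    return sorted(eup_by_user, key=eup_by_user.get)

# EUP values are taken from the example above; the user names are made up.
eup_by_user = {"heavy_user": 156.0, "waiting_user": 0.5}

order = negotiation_order(eup_by_user)
print(order)  # the user with EUP 0.5 is listed first
```

Under that assumption, the waiting 8-node job should be considered before the next 12-process job, which is the opposite of what we observe.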

Thanks for any initial insight or input on how to address this.

rob


==========================
Robert E. Parrott, Ph.D. (Phys. '06)
Project Manager, CrimsonGrid Initiative and
Program Manager, CyberInfrastructure Lab
Harvard University Sch. of Eng. and App. Sci.
Maxwell-Dworkin  211,
33 Oxford St.
Cambridge, MA 02138
(617)-495-5045