[Condor-users] problems running jobs
- Date: Fri, 24 Sep 2004 10:48:47 -0500
- From: Andy Wettstein <ajw@xxxxxxxxxxxxxxx>
- Subject: [Condor-users] problems running jobs
Hello,
I have been having problems with Condor either not accepting jobs or
taking several minutes before it runs one. I am just running the "hello
Condor" example binary.
condor_q -analyze says this (my requirements restrict the job to a single
machine, which is why only 4 processors match):
012.007: Run analysis summary. Of 124 machines,
120 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match, but are serving users with a better priority in the pool
4 match, but prefer another specific job despite its worse user-priority
0 match, but will not currently preempt their existing job
0 are available to run your job
16 jobs; 16 idle, 0 running, 0 held
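As a sanity check on the summary above, the per-reason counts should add up to the machine total. Here is a minimal Python sketch of that check, assuming the summary always has the line layout shown here; `parse_analysis` is a hypothetical helper written for this post, not part of Condor:

```python
import re

# Summary text copied from the condor_q -analyze output above.
SUMMARY = """\
012.007: Run analysis summary. Of 124 machines,
120 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match, but are serving users with a better priority in the pool
4 match, but prefer another specific job despite its worse user-priority
0 match, but will not currently preempt their existing job
0 are available to run your job
"""


def parse_analysis(text):
    """Return (total_machines, {reason: count}) from an analysis summary.

    Hypothetical helper: assumes one "Of N machines" header line followed
    by rows of the form "<count> <reason>".
    """
    total = None
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        row = re.match(r"(\d+)\s+(.+)", line)
        if row:
            counts[row.group(2)] = int(row.group(1))
    return total, counts


total, counts = parse_analysis(SUMMARY)
accounted = sum(counts.values())
print(f"{accounted} of {total} machines accounted for")
```

For this output the six counts sum to 124, so every machine is accounted for, and all 4 matching processors fall under "prefer another specific job".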
When I submit the jobs I get lines like this in the SchedLog:
9/24 10:18:17 QMGR Connection closed
9/24 10:18:18 DaemonCore: Command received via TCP from host <128.101.222.203:56732>
9/24 10:18:18 DaemonCore: received command 1111 (QMGMT_CMD), calling handler (handle_q)
9/24 10:18:18 condor_read(): Socket closed when trying to read buffer
These are the only errors I can find for the jobs.
Usually after 10+ minutes the job finally runs, though.