  I have never worked with Qpid, but from a quick look at the documentation it seems that it simply provides a high-level interface, in your case to Condor. I am surprised that the native Condor commands are not working. Failing that, you might have to look for a wrapper around the native Condor commands.
Sorry I couldn't be of more help to you.
On 21/09/2010, at 5:03 AM, Shahaan Ayyub <shahaan@xxxxxxxxx> wrote:

Hi Allen,
  What does condor_q -better-analyze say at different points in time, i.e. when some of the jobs are held while others are still running or have completed?
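
  If it helps, here is a rough sketch of how the output could be captured at intervals so the snapshots can be compared afterwards. It is only a sketch in Python: the function names (snapshot, collect), the number of snapshots and the interval are my own assumptions, and it presumes condor_q is on the PATH of the submit machine.

```python
import subprocess
import time
from datetime import datetime


def snapshot(run=subprocess.run, now=datetime.now):
    """Run `condor_q -better-analyze` once and return (timestamp, output).

    `run` and `now` are injectable only so the sketch can be tried
    without a Condor installation; they are not part of Condor.
    """
    result = run(
        ["condor_q", "-better-analyze"],
        capture_output=True,
        text=True,
        check=False,  # keep going even if condor_q exits non-zero
    )
    return now().isoformat(timespec="seconds"), result.stdout


def collect(count, interval_seconds, run=subprocess.run, sleep=time.sleep):
    """Take `count` snapshots, waiting `interval_seconds` between them."""
    snapshots = []
    for i in range(count):
        snapshots.append(snapshot(run=run))
        if i < count - 1:
            sleep(interval_seconds)
    return snapshots
```

  Calling collect(10, 60) right after submitting the test jobs would give ten snapshots one minute apart; each entry is a (timestamp, text) pair that can be printed or written to a file.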


On 21/09/2010, at 3:07 AM, "Berg, Allen" <aberg@xxxxxxxx> wrote:

We have a relatively small Condor cluster: fifteen machines with a total of 140 CPUs.


We have implemented it using Apache Qpid. The Apache Qpid daemon is installed on the master node; this package provides the queue “server”, the facility that provides message queuing to the cluster. The Apache Qpid API for C++ is installed on each cluster node.


My question is about the following behaviour. When I submit, say, two very simple jobs (each just a sleep command) for two of the nodes, the first job takes off and runs, while the second sits there for possibly 20 minutes before it times out. I am not seeing any errors or other signs of anything unusual in any of the Condor logs. If I then run a larger test of, say, 40 jobs that each sleep for 5 seconds, I would expect all 40 to be picked up and run, completing in a reasonable amount of time. What I actually see is that maybe 20 jobs take off, then 12 start, then maybe 8, and then the last few complete. How can I find out how the queue is actually performing, and what can I do to better tune it?




