
Re: [Condor-users] condor_submit hangs / condor_q hangs



Yes, the 2 submit hosts are behaving the same way.
CONDOR_HOST is defined in each config to point to the master node.

Yes, the collector, negotiator, and master are running on the master node.

Thanks!
Adam


Marco Mambelli wrote:
Hi Adam,
Are the 2 submit nodes behaving the same way?
Did you set CONDOR_HOST to point to your "master node"?
Are you running master, collector and negotiator on the "master node"?
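
A minimal sketch of how these settings could be checked on a submit node, assuming the condor_config_val tool is in PATH. The function name show_pool_settings is hypothetical, and the choice of COLLECTOR_HOST as an extra setting to print is an assumption, not something asked for in this thread:

```shell
# Sketch: print the pool-related settings as the local Condor config sees them.
# show_pool_settings is a hypothetical helper name.
show_pool_settings() {
  for knob in CONDOR_HOST COLLECTOR_HOST DAEMON_LIST; do
    if command -v condor_config_val >/dev/null 2>&1; then
      # condor_config_val exits non-zero when the macro is not defined
      value=$(condor_config_val "$knob" 2>/dev/null) || value="(not defined)"
    else
      value="(condor_config_val not found)"
    fi
    printf '%s = %s\n' "$knob" "$value"
  done
}
```

Running show_pool_settings on n00, n01 and the master node and comparing the output is one way to confirm that every host resolves CONDOR_HOST to the same machine.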

Depending on the configuration, you may also have logs in /tmp.

The schedd is single-threaded, so if submission is hanging then condor_q will hang too.

Marco



On Thu, 18 Mar 2010, Steven Timm wrote:

What's in the SchedLog on the submit machine? It sounds
like it could be some kind of authentication issue between
condor_submit, condor_q and the schedd, or else a schedd that's
just totally hosed for some reason.

You can also run

export _CONDOR_TOOL_DEBUG=D_ALL ; condor_submit -debug <args>
to get some debugging info from condor_submit.
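
A sketch of how that debug output and the end of the schedd log could be collected into one file for posting to the list. It assumes the environment override is spelled with the _CONDOR_ prefix that Condor uses for configuration macros set from the environment, and that SCHEDD_LOG points at the SchedLog file. The function name collect_submit_debug and the file names job.sub and submit_debug.txt are hypothetical:

```shell
# Sketch: save condor_submit debug output plus the tail of SchedLog.
# collect_submit_debug, job.sub and submit_debug.txt are hypothetical names.
collect_submit_debug() {
  submit_file=${1:-job.sub}
  out=${2:-submit_debug.txt}

  if ! command -v condor_submit >/dev/null 2>&1; then
    echo "condor_submit not found in PATH" >&2
    return 1
  fi

  # -debug sends the tool's debug messages to stderr; capture both streams.
  _CONDOR_TOOL_DEBUG=D_ALL condor_submit -debug "$submit_file" >"$out" 2>&1
  status=$?

  # Append the last lines of the schedd log if it can be located and read.
  schedd_log=$(condor_config_val SCHEDD_LOG 2>/dev/null)
  if [ -n "$schedd_log" ] && [ -r "$schedd_log" ]; then
    tail -n 50 "$schedd_log" >>"$out"
  fi
  return "$status"
}
```

For example, collect_submit_debug myjob.sub debug.txt would leave everything in debug.txt even if the submit itself times out.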

Steve


On Thu, 18 Mar 2010, Adam Yates wrote:

 Hi everyone;

 I'm having a problem with a fresh condor install.
 My setup is this:

 1 master node (master)
 2 interactive nodes (submit only- n00 and n01)
 64 worker nodes (execute only n02-n66)

 Config structure:

 master - /home/condor is NFS-exported to all nodes. Local configs are in
 /home/condor/$HOSTNAME/condor_config.local


 Whenever I use condor_submit, it hangs on "Submitting job.." and then
 eventually times out, saying that it failed to connect to the local
 machine on port x and failed to fetch ads from the localhost on port x.

 Whenever I use condor_q, it hangs, but if I run condor_q -global, it
 returns the status of my condor pool and shows some of the nodes in use.

 The daemon listing in my config on the submit nodes is as such:
 DAEMON_LIST = MASTER, SCHEDD

 There are no errors in the local log (log/*)

 Can anybody make sense of this? Please let me know if any more info is
 needed.





--
Adam Yates
Systems Administrator -- Research Infrastructure
Center for Computation and Technology
232 Johnston Hall,
Baton Rouge, LA 70803
W: 225.578.8235    C: 225.663.0218
<yates@xxxxxxxxxxx>