
Re: [Condor-users] Condor_master won't start!!



Is there anything interesting in the MasterLog on 8th April when it stopped running? Might want to verify you haven't run out of disk space somewhere.
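Both checks can be done in one go. The following is only a rough sketch: the function name check_condor_logs is made up for illustration, and it assumes condor_config_val is on the PATH so that it can report where LOG points (otherwise pass the log directory as an argument).

```shell
# Hypothetical helper (not part of Condor): show the end of the MasterLog
# and the free space and inodes on the filesystem that holds the logs.
check_condor_logs() {
    # Use the directory given as $1, else ask Condor where LOG points.
    log_dir="${1:-$(condor_config_val LOG 2>/dev/null)}"
    if [ -z "$log_dir" ] || [ ! -d "$log_dir" ]; then
        echo "log directory not found: '$log_dir'" >&2
        return 1
    fi
    echo "== last lines of $log_dir/MasterLog =="
    tail -n 20 "$log_dir/MasterLog" 2>/dev/null || echo "(MasterLog not readable)"
    echo "== free space =="
    df -h "$log_dir" || echo "(space info unavailable)"
    echo "== free inodes =="
    df -i "$log_dir" || echo "(inode info unavailable)"
}
```

Running check_condor_logs as root on the central manager should show whether the log filesystem is full. A full disk or an exhausted inode table would be consistent with a log file that gets opened but never grows.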

Rob

Steven Platt wrote:
Hi,

I came back to my cluster after a few days away and found that the collector daemon was down. Here’s the output from a quick attempt to restart the master on the central manager:

[steve@queen condor]$ condor_status
CEDAR:6001:Failed to connect to <158.119.147.62:9618>
Error: Couldn't contact the condor_collector on queen.bioinformatics.
...... standard “Extra info:” text ......
[steve@queen condor]$ condor_restart
Can't connect to local master
[steve@queen condor]$ su
Password:
[root@queen condor]# condor_master
[root@queen condor]# ps -ef | grep condor
root     29265 29101  0 14:22 pts/1    00:00:00 grep condor

As you can see, there are no Condor processes running.

An ls -l of the logs directory before and after the above commands shows that the MasterLog access time has been updated, but the file size is the same as before the commands were issued. The last few lines of the MasterLog are dated 8th April, which makes me think that it’s being opened and then closed without anything actually being written. All other logs show a last access at least a week ago.
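For anyone wanting to repeat that before/after comparison, here is a minimal sketch. The function name log_snapshot is hypothetical and the file path is passed in by the caller. It relies only on wc and on ls -l / ls -lu (modification time and access time respectively).

```shell
# Hypothetical helper: print size, modification time and access time of a
# log file, so that two runs can be compared by eye or with diff.
log_snapshot() {
    f="$1"
    [ -f "$f" ] || { echo "no such file: $f" >&2; return 1; }
    size=$(wc -c < "$f" | tr -d ' ')
    echo "size_bytes=$size"
    echo "modified: $(ls -l "$f")"    # ls -l shows the last modification time
    echo "accessed: $(ls -lu "$f")"   # ls -lu shows the last access time
}
```

Calling log_snapshot on the MasterLog before and after running condor_master, and comparing the two outputs, would show an unchanged size_bytes with a newer access time if the file is being opened but not written to.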

This was set up by a contractor who’s no longer around and while I don’t think he set any “special” parameters I can’t say for certain.

I’m running CondorVersion 7.0.5 on a Rocks v5.1 cluster of CentOS 5.2 boxes. We’ve got Quill & CondorView plumbed in as well. I can post config files on request.

As it’s on Rocks we’ve also tried running service rocks_condor (stop|start|restart) which has no effect … I think this is just a wrapper for the regular condor_* commands.

I can’t work out why condor_master is not doing anything and I’d really appreciate any advice.

Thanks

Steve

Dr Steven Platt

Bioinformatics Support Coordinator

Statistics, Modelling and Bioinformatics

Health Protection Agency

Centre for Infections

61 Colindale Avenue

London

NW9 5EQ

www.hpa.org.uk/bioinformatics <http://www.hpa.org.uk/bioinformatics>
