
Re: [HTCondor-users] condor head node connection to other node



It sounds as though your config settings are not right. Check the following in the condor_config file on a typical machine (you can query each value at the command line with condor_config_val, e.g. condor_config_val -v CONDOR_HOST):

·         CONDOR_HOST – is it pointing to your “head node” / admin machine?

·         START – is it set to TRUE? (Otherwise the machine will not accept jobs.)

·         DAEMON_LIST – does it include MASTER and STARTD (plus SCHEDD if jobs are submitted from that machine, and KBDD if keyboard activity should be monitored)?

·         UID_DOMAIN – if you have a Windows setup it should be set to your company domain

·         ALLOW_OWNER – should be set to your admin machine

·         ALLOW_ADMINISTRATOR – should be set to your admin machine
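
As a rough sketch, assuming condor_config_val is on the PATH of the machine being checked, the settings listed above could be printed in one go like this. The show_settings helper is only an illustrative name, not part of HTCondor:

```shell
#!/bin/sh
# Print the effective value of each setting named on the command line.
# show_settings is a hypothetical helper; it relies only on condor_config_val.
show_settings() {
    for name in "$@"; do
        # condor_config_val exits non-zero when a macro is not defined
        if value=$(condor_config_val "$name" 2>/dev/null); then
            printf '%-20s %s\n' "$name" "$value"
        else
            printf '%-20s %s\n' "$name" "(not defined)"
        fi
    done
}

# Only run the check when the HTCondor tools are actually installed here.
if command -v condor_config_val >/dev/null 2>&1; then
    show_settings CONDOR_HOST START DAEMON_LIST UID_DOMAIN \
                  ALLOW_OWNER ALLOW_ADMINISTRATOR
else
    echo "condor_config_val not found on PATH" >&2
fi
```

Running it on the head node and on one compute node and comparing the two outputs should show quickly whether CONDOR_HOST points to the same machine on both.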

 

On your admin machine check the following:

·         ALLOW_READ – this should cover the machines that you want to join your pool

·         ALLOW_WRITE – this should cover the machines that you want to join your pool
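
A minimal sketch of how this could be checked and applied without reinstalling anything, assuming the HTCondor command-line tools are on the PATH of the admin machine. The check_access helper and the example host pattern in the comment are hypothetical:

```shell
#!/bin/sh
# Show the current pool-access lists on the admin machine.
# check_access is a hypothetical helper name, not an HTCondor command.
check_access() {
    for name in ALLOW_READ ALLOW_WRITE; do
        if value=$(condor_config_val "$name" 2>/dev/null); then
            echo "$name = $value"
        else
            echo "$name is not defined"
        fi
    done
}

if command -v condor_config_val >/dev/null 2>&1; then
    check_access
    # If the lists do not cover the worker nodes, edit the config file first
    # (for example with a hypothetical pattern such as *.cluster.example.org),
    # then let the running daemons pick up the change:
    condor_reconfig   # asks the daemons to re-read their configuration
    condor_status     # worker nodes should be listed once they report in
else
    echo "HTCondor command-line tools not found on PATH" >&2
fi
```

It may take a few minutes after condor_reconfig before the compute nodes show up in condor_status.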

 

For future reference, what kind of setup do you have:

·         Windows or *nix

·         What kind of job are you trying to run? Hopefully you are just trying to run a script file (shell in *nix or DOS batch in Windows)

 

 

[I hope I have this right – if a more experienced user could review this, then that would be great…]

 

 

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Jim Wang
Sent: Thursday, January 09, 2014 6:57 AM
To: Steffen Grunewald; HTCondor-Users Mail List
Subject: Re: [HTCondor-users] condor head node connection to other node

 

Steve:

Thanks for your quick response.

I used vanilla.
condor_status -t shows only the head node; no other nodes are listed.

If it's a Condor config issue, is there a simple/quick way to check the current settings and add those nodes without reinstalling the whole system?

 

Thanks

 

Jim

 


On Wed, Jan 08, 2014 at 11:47:40AM -0800, Jim Wang wrote:
> Hi,
>   I am new to condor.
>   My workplace has just set up an HTCondor system, and I have been asked to run some simple tests to see whether it is set up correctly.
>  
>   The system has a head node and 40 other nodes for computation. I was given a user name and password that let me log in to the head node. When I launched several simple Condor jobs, I found that they ran on the head node instead of on the other 40 nodes. I added "RANK=....." to force the jobs to run on the other nodes, but they still ran on the head node. I tried to ssh to the other nodes from the head node, but that failed.

Which universe are you using?
What does "condor_status -t" tell you? Are there any other machines available at all?
If yes, check the Negotiator and Match Logs for more information.

ssh isn't required for Condor to work.



>   My question is: is this something that needs to be fixed in the Condor configuration, or is it just a network or user-account permission issue?



Almost sure: Condor config.

- S

 

 
