
[Condor-users] Windows Server 2003 condor_submit problem



Hello,

 

I have a Condor pool consisting of a Windows XP machine (master) and a Windows Server 2003 machine (slave). I am unable to get jobs to run on the Windows 2003 machine: they sit in the queue with an ‘Idle’ status until slots on the master become available.

Does anyone know how to fix this problem?

 

Running condor_status on the master shows that all machines are in the pool.

 

In the condor_config files, I have:

 

On the Slave:

DAEMON_LIST = MASTER START

On the Master:

DAEMON_LIST = MASTER COLLECTOR NEGOTIATOR SCHEDD START

 

On the Slave:

COLLECTOR_NAME = My Pool

On the Master:

COLLECTOR_NAME = fbp-test-pool

 

On the Slave:

CONDOR_HOST = <ip address of master>

On the master:

CONDOR_HOST = $(FULL_HOSTNAME)

 

Also on the slave:

        ADD_WINDOWS_FIREWALL_EXCEPTION = FALSE

 

        WINDOWS_FIREWALL_FAILURE_RETRY = 10
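To make the differences between the two files easier to see, here is a minimal Python sketch that diffs the settings quoted above. It is only an illustration: parse_settings and diff_settings are hypothetical helper names, not part of Condor, and the parser handles only plain NAME = value lines, not the full condor_config syntax. The values are copied as posted, including the <ip address of master> placeholder.

```python
# Settings copied verbatim from the two condor_config excerpts above.
SLAVE = """
DAEMON_LIST = MASTER START
COLLECTOR_NAME = My Pool
CONDOR_HOST = <ip address of master>
ADD_WINDOWS_FIREWALL_EXCEPTION = FALSE
WINDOWS_FIREWALL_FAILURE_RETRY = 10
"""

MASTER = """
DAEMON_LIST = MASTER COLLECTOR NEGOTIATOR SCHEDD START
COLLECTOR_NAME = fbp-test-pool
CONDOR_HOST = $(FULL_HOSTNAME)
"""


def parse_settings(text):
    """Parse plain NAME = value lines into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Map each differing name to (value_in_a, value_in_b); None means unset."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    slave = parse_settings(SLAVE)
    master = parse_settings(MASTER)
    for name, (s, m) in diff_settings(slave, master).items():
        print(f"{name}: slave={s!r} master={m!r}")
```

Run as a script, it prints one line per setting whose value differs between the slave and the master, with None for a setting that appears in only one file.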

 

When I run condor_q on the slave machine, I get this error:

 

Error: Can't find address for schedd <my windows 2003 machine>

Extra Info: You probably saw this error because the condor_schedd is not
running on the machine you are trying to query. If the condor_schedd is not
running, the Condor system will not be able to find an address and port to
connect to and satisfy this request. Please make sure the Condor daemons are
running and try again.

Extra Info: If the condor_schedd is running on the machine you are trying to
query and you still see the error, the most likely cause is that you have
setup a personal Condor, you have not defined SCHEDD_NAME in your
condor_config file, and something is wrong with your SCHEDD_ADDRESS_FILE
setting. You must define either or both of those settings in your config
file, or you must use the -name option to condor_q. Please see the Condor
manual for details on SCHEDD_NAME and SCHEDD_ADDRESS_FILE.

 

I guess this makes sense, since the schedd is NOT running on the slave.

 

 

Also, when I reverse the roles (making the Windows XP machine the slave and the Windows 2003 machine the master), I get the same result: jobs run on the Windows XP machine but not on the Windows 2003 machine.

 

Any help would be appreciated.

Thanks,

Diane