
Re: [Condor-users] Windows Server 2003 condor_submit problem

Hi Diane:

Windows 2003 and XP are two distinct versions of Windows, so your jobs will
need to account for this:

Requirements = (OpSys == "WINNT51") || (OpSys == "WINNT52")

That should solve the idle problem you see when running jobs from the CM (if
you add it to your jobs' submit files).
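
As a rough sketch, a complete submit file with that requirement might look
something like the following. The executable and file names are placeholders
I made up, and I'm assuming the vanilla universe; WINNT51 is what XP
advertises and WINNT52 is what Server 2003 advertises:

```
# Sketch of a submit description file; all file names are hypothetical.
universe     = vanilla
executable   = myjob.exe
output       = myjob.out
error        = myjob.err
log          = myjob.log
# Match both Windows XP (WINNT51) and Windows Server 2003 (WINNT52).
requirements = (OpSys == "WINNT51") || (OpSys == "WINNT52")
queue
```

If the requirements don't mention OpSys at all, condor_submit normally adds
a clause restricting the job to the submit machine's own OpSys, which would
explain jobs waiting for the XP slots. Running "condor_q -analyze" on an idle
job should show which machines are rejecting it.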

As for the other errors: try disabling the firewall on the Windows 2003
machine, and see if your "condor_q" commands work on the worker nodes.  If
they do, then you may need to open some ports in the
firewall.  I see that you are also using ADD_WINDOWS_FIREWALL_EXCEPTION; are
the firewalls disabled on the worker nodes?  Also, is there any particular
reason your CM's COLLECTOR_NAME differs from that of the worker nodes?
(they should probably all be the same, if they are all part of the same


From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of diane
Sent: Monday, June 23, 2008 2:47 PM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Windows Server 2003 condor_submit problem


I have a condor pool consisting of a Windows XP machine (master) and a
Windows 2003 machine (slave), and am unable to get jobs to run on the
Windows 2003 machine (they sit in the queue with an "Idle" status until
"master" slots become available).
Does anyone know how to fix this problem?

Running condor_status from the master indicates all machines are in the

In the condor_config files, I have:

On the Slave:
On the Master:

On the Slave:
On the Master:
COLLECTOR_NAME = fbp-test-pool

On the Slave:
CONDOR_HOST = <ip address of master>
On the master:

Also on the slave:


When I run condor_q on the slave machine, I get this error:

Error: Can't find address for schedd <my windows 2003 machine>

Extra Info: You probably saw this error because the condor_schedd is not
running on the machine you are trying to query. If the condor_schedd is not
running, the Condor system will not be able to find an address and port to
connect to and satisfy this request. Please make sure the Condor daemons are
running and try again.

Extra Info: If the condor_schedd is running on the machine you are trying to
query and you still see the error, the most likely cause is that you have
setup a personal Condor, you have not defined SCHEDD_NAME in your
condor_config file, and something is wrong with your SCHEDD_ADDRESS_FILE
setting. You must define either or both of those settings in your config
file, or you must use the -name option to condor_q. Please see the Condor
manual for details on SCHEDD_NAME and SCHEDD_ADDRESS_FILE.

I guess this makes sense since schedd is NOT running on the slave.

Also, when I reverse the roles (make the Windows XP the slave and the
Windows 2003 the master) I get the same results (jobs run on the Windows XP
machine but not the 2003 machine).

Any help would be appreciated.