
Re: [HTCondor-users] Master Daemon - can't find address



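Uh, is your daemon list entry actually this, verbatim?

DAEMON_LIST = MASTER SCHEDD STARTD KBDD - should already be set correctly

Or is the trailing "- should already be set correctly" something you
added as a comment after the fact? If that text really is in the config
file, it is part of the value rather than a comment, so the master would
see "-", "should", "already" and so on as extra daemon names.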
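
If it helps to check this mechanically, here is a minimal sketch in Python.
It assumes the dump is plain "NAME = value" lines, as in the output quoted
below, and that every name in DAEMON_LIST should have its own entry giving
the daemon's binary (as MASTER, SCHEDD, STARTD and KBDD do in that output).
The function names parse_dump and undefined_daemons are made up for this
example and are not part of HTCondor.

```python
import re


def parse_dump(dump_text):
    """Parse 'NAME = value' lines into a dict keyed by upper-cased name."""
    settings = {}
    for line in dump_text.splitlines():
        # Split on the first '=' only, so values containing '==' stay intact.
        name, sep, value = line.partition("=")
        if sep and name.strip():
            settings[name.strip().upper()] = value.strip()
    return settings


def undefined_daemons(settings):
    """Return DAEMON_LIST tokens that have no entry of their own.

    Assumption: a real daemon name has a matching 'NAME = path' setting.
    """
    tokens = re.split(r"[,\s]+", settings.get("DAEMON_LIST", "").strip())
    return [t for t in tokens if t and t.upper() not in settings]


# A few lines copied from the dump quoted in this thread.
sample = """\
DAEMON_LIST = MASTER SCHEDD STARTD KBDD - should already be set correctly
KBDD = $(SBIN)/condor_kbdd.exe
MASTER = $(SBIN)/condor_master.exe
SCHEDD = $(SBIN)/condor_schedd.exe
STARTD = $(SBIN)/condor_startd.exe
"""

bad = undefined_daemons(parse_dump(sample))
print(bad)  # ['-', 'should', 'already', 'be', 'set', 'correctly']
```

An empty list would suggest the entry is clean. This is only a heuristic:
it would miss a stray word that happens to match some other setting name.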

On Thu, Oct 31, 2013 at 8:09 PM, Andrew Mole <Andrew.Mole@xxxxxxxx> wrote:
> To repeat / restate my question, is there anything in the Windows
> installation that makes settings that cannot be fixed by making changes to
> the config files? The IT people who installed Condor on a colleague's
> computer did so without following the instructions I gave them for setting
> it up to be a submit and execute node. I then made changes to the config
> files so that it would be the same as other machines in our pool, but it
> does not seem to start the services to announce it as an execute node, and
> when I try to submit a job from it, it complains about not being able to see
> its own address (see my previous emails for details).
>
> Do I need to make changes to the registry, or some other location other than
> the config files?
>
> Any guidance that anyone can provide would be much appreciated.
>
> Andrew
>
> On 30 Oct, 2013, at 3:21 PM, "Andrew Mole" <Andrew.Mole@xxxxxxxx> wrote:
>
> Hi Ben, Ziliang,
>
>
>
> Any feedback on this? It is still causing me problems.
>
>
>
> Andrew
>
>
>
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf
> Of Andrew Mole
> Sent: Monday, October 28, 2013 11:24 AM
> To: HTCondor-Users Mail List
> Subject: Re: [HTCondor-users] Master Daemon - can't find address
>
>
>
> Here it is… Sorry about the delay in responding – I was in Nepal with no
> access to internet…
>
>
>
>
>
> CPUIDLE = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
>
> CREAM_GAHP = $(SBIN)/cream_gahp
>
> CRED_MIN_TIME_LEFT = 120
>
> CRED_STORE_DIR = $(LOCAL_DIR)/cred_dir
>
> CREDD = $(SBIN)/condor_credd.exe
>
> CREDD_ADDRESS_FILE = $(LOG)/.credd_address
>
> CREDD_ARGS = -p $(CREDD_PORT) -f
>
> CREDD_CACHE_LOCALLY = True
>
> CREDD_DEBUG = D_FULLDEBUG
>
> CREDD_HOST = HKGSTR195.asset.general.firm.com
>
> CREDD_LOG = $(LOG)/CredLog
>
> CREDD_PORT = 9620
>
> DAEMON_LIST = MASTER SCHEDD STARTD KBDD - should already be set correctly
>
> DBMSD = $(SBIN)/condor_dbmsd.exe
>
> DBMSD_ARGS = -f
>
> DBMSD_LOG = $(LOG)/DbmsdLog
>
> DEFRAG = $(LIBEXEC)/condor_defrag.exe
>
> DELTACLOUD_GAHP = $(SBIN)/deltacloud_gahp
>
> DETECTED_CORES = 8
>
> DETECTED_MEMORY = 32727
>
> EC2_GAHP = $(SBIN)/ec2_gahp
>
> EC2_GAHP_LOG = /tmp/EC2GahpLog.$(USERNAME)
>
> ENABLE_ADDRESS_REWRITING = true
>
> ENABLE_PERSISTENT_CONFIG = false
>
> ENABLE_RUNTIME_CONFIG = false
>
> EXECUTE = $(LOCAL_DIR)/execute
>
> FILESYSTEM_DOMAIN = $(UID_DOMAIN)
>
> FILETRANSFER_PLUGINS = $(LIBEXEC)/curl_plugin, $(LIBEXEC)/data_plugin
>
> FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
>
> FLOCK_FROM =
>
> FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
>
> FLOCK_TO =
>
> FULL_HOSTNAME = PC2349.asset.general.firm.com
>
> GLITE_LOCATION = $(LIBEXEC)/glite
>
> GRID_MONITOR = $(SBIN)/grid_monitor
>
> GRIDMANAGER = $(SBIN)/condor_gridmanager.exe
>
> GRIDMANAGER_DEBUG =
>
> GRIDMANAGER_JOB_PROBE_INTERVAL = 300
>
> GRIDMANAGER_LOCK = $(LOCK)/GridmanagerLock.$(USERNAME)
>
> GRIDMANAGER_LOG = $(LOG)/GridmanagerLog.$(USERNAME)
>
> GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10
>
> GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_EC2 = 20
>
> GT2_GAHP = $(SBIN)/gahp_server
>
> HAD_DEBUG =
>
> HAD_LOG = $(LOG)/HADLog
>
> HDFS = $(SBIN)/condor_hdfs.exe
>
> HDFS_BACKUPNODE = hdfs://example.com:50100
>
> HDFS_BACKUPNODE_WEB = example.com:50105
>
> HDFS_DATANODE_DIR = /scratch/tmp/hadoop_data
>
> HDFS_DEBUG =
>
> HDFS_LOG = $(LOG)/HDFSLog
>
> HDFS_NAMENODE = hdfs://example.com:9000
>
> HDFS_NAMENODE_DIR = /tmp/hadoop_name
>
> HDFS_NAMENODE_ROLE = ACTIVE
>
> HDFS_NAMENODE_WEB = example.com:8000
>
> HDFS_NODETYPE = HDFS_DATANODE
>
> HIGHLOAD = 0.5
>
> HISTORY = $(SPOOL)/history
>
> HOSTNAME = PC2349
>
> HOUR = (60 * $(MINUTE))
>
> INCLUDE = $(RELEASE_DIR)/include
>
> INVALID_LOG_FILES = core
>
> IP_ADDRESS = 10.218.22.349
>
> ISMPI = (TARGET.JobUniverse == $(MPI))
>
> ISSTANDARD = (TARGET.JobUniverse == $(STANDARD))
>
> ISVANILLA = (TARGET.JobUniverse == $(VANILLA))
>
> ISVM = (TARGET.JobUniverse == $(VM))
>
> JAVA = C:\PROGRA~2\Java\jre6\bin\java.exe
>
> JAVA_BENCHMARK_TIME = 2
>
> JAVA_CLASSPATH_ARGUMENT = -classpath
>
> JAVA_CLASSPATH_DEFAULT = $(BIN) $(BIN)/scimark2lib.jar .
>
> JAVA_CLASSPATH_SEPARATOR = :
>
> JAVA_EXTRA_ARGUMENTS =
>
> JOB_RENICE_INCREMENT = 10
>
> JOB_ROUTER = $(LIBEXEC)/condor_job_router.exe
>
> JOB_ROUTER_DEBUG =
>
> JOB_ROUTER_LOG = $(LOG)/JobRouterLog
>
> JUSTCPU = ($(CPUBusy) && ($(KeyboardBusy) == False))
>
> KBDD = $(SBIN)/condor_kbdd.exe
>
> KBDD_ADDRESS_FILE = $(LOG)/.kbdd_address
>
> KBDD_DEBUG =
>
> KBDD_LOG = $(LOG)/KbdLog
>
> KEYBOARDBUSY = (KeyboardIdle < $(MINUTE))
>
> KEYBOARDNOTBUSY = ($(KeyboardBusy) == False)
>
> KILL = $(UWCS_KILL)
>
> LASTCKPT = (time() - LastPeriodicCheckpoint)
>
> LEASEMANAGER = $(SBIN)/condor_lease_manager.exe
>
> LEASEMANAGER.CLASSAD_LOG = $(SPOOL)/LeaseManagerState
>
> LEASEMANAGER.DEBUG_ADS = False
>
> LEASEMANAGER.GETADS_INTERVAL = 60
>
> LEASEMANAGER.PRUNE_INTERVAL = 60
>
> LEASEMANAGER.UPDATE_INTERVAL = 300
>
> LEASEMANAGER_DEBUG = D_FULLDEBUG
>
> LEASEMANAGER_LOG = $(LOG)/LeaseManagerLog
>
> LEASEMANGER_ADDRESS_FILE = $(LOG)/.lease_manager_address
>
> LIB = $(RELEASE_DIR)/lib
>
> LIBEXEC = $(BIN)
>
> LIBVIRT_XML_SCRIPT = $(LIBEXEC)/libvirt_simple_script.awk
>
> LOCAL_CONFIG_DIR = $(LOCAL_DIR)/config
>
> LOCAL_CONFIG_DIR_EXCLUDE_REGEXP =
> ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$
>
> LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local
>
> LOCAL_DIR = $(RELEASE_DIR)
>
> LOCK = $(LOG)
>
> LOG = $(LOCAL_DIR)/log
>
> MACHINEBUSY = ($(CPUBusy) || $(KeyboardBusy))
>
> MACHINEMAXVACATETIME = $(MaxVacateTime)
>
> MAIL = $(BIN)/condor_mail.exe
>
> MAIL_FROM =
>
> MASTER = $(SBIN)/condor_master.exe
>
> MASTER_ADDRESS_FILE = $(LOG)/.master_address
>
> MASTER_DEBUG =
>
> MASTER_LOG = $(LOG)/MasterLog
>
> MAX_C_GAHP_LOG = 1000000
>
> MAX_COLLECTOR_LOG = 1000000
>
> MAX_CREDD_LOG = 4000000
>
> MAX_GRIDMANAGER_LOG = 1000000
>
> MAX_HAD_LOG = 1000000
>
> MAX_HDFS_LOG = 1000000
>
> MAX_JOB_ROUTER_LOG = 1000000
>
> MAX_KBDD_LOG = 1000000
>
> MAX_LEASEMANAGER_LOG = 1000000
>
> MAX_MASTER_LOG = 1000000
>
> MAX_NEGOTIATOR_LOG = 1000000
>
> MAX_NEGOTIATOR_MATCH_LOG = 1000000
>
> MAX_REPLICATION_LOG = 1000000
>
> MAX_ROOSTER_LOG = 1000000
>
> MAX_SCHEDD_LOG = 1000000
>
> MAX_SHADOW_LOG = 1000000
>
> MAX_SHARED_PORT_LOG = 1000000
>
> MAX_STARTD_LOG = 1000000
>
> MAX_STARTER_LOG = 1000000
>
> MAX_STORK_LOG = 4000000
>
> MAX_TRANSFERER_LOG = 1000000
>
> MAX_VM_GAHP_LOG = 1000000
>
> MAXJOBRETIREMENTTIME = $(UWCS_MaxJobRetirementTime)
>
> MAXSUSPENDTIME = 10 * $(MINUTE)
>
> MAXVACATETIME = 10 * $(MINUTE)
>
> MEDIUMJOB = (TARGET.ImageSize >= (15 * 1024) && TARGET.ImageSize < (50 *
> 1024))
>
> MINUTE = 60
>
> MPI = 8
>
> NEGOTIATOR = $(SBIN)/condor_negotiator.exe
>
> NEGOTIATOR_DEBUG = D_MATCH
>
> NEGOTIATOR_LOG = $(LOG)/NegotiatorLog
>
> NEGOTIATOR_MATCH_LOG = $(LOG)/MatchLog
>
> NEGOTIATOR_POST_JOB_RANK = $(UWCS_NEGOTIATOR_POST_JOB_RANK)
>
> NEGOTIATOR_PRE_JOB_RANK = $(UWCS_NEGOTIATOR_PRE_JOB_RANK)
>
> NO_DNS = false
>
> NONCONDORLOADAVG = (LoadAvg - CondorLoadAvg)
>
> NORDUGRID_GAHP = $(SBIN)/nordugrid_gahp
>
> OASYSGSA = "8.6"
>
> OPSYS = WINDOWS
>
> OPSYSANDVER = WINDOWS601
>
> OPSYSLEGACY = WINNT61
>
> OPSYSLONGNAME = Windows 7 SP1
>
> OPSYSMAJORVER = 601
>
> OPSYSNAME = Windows7
>
> OPSYSSHORTNAME = Win7
>
> OPSYSVER = 601
>
> PERIODIC_CHECKPOINT = $(UWCS_PERIODIC_CHECKPOINT)
>
> PID = 7892
>
> PPID = 4816
>
> PREEMPT = FALSE
>
> PREEMPTION_RANK = $(UWCS_PREEMPTION_RANK)
>
> PREEMPTION_REQUIREMENTS = $(UWCS_PREEMPTION_REQUIREMENTS)
>
> PREEN = $(SBIN)/condor_preen.exe
>
> PREEN_ARGS = -m -r
>
> PROCD = $(SBIN)/condor_procd.exe
>
> PROCD_ADDRESS = \\.\pipe\condor_procd_pipe
>
> PROCD_LOG = $(LOG)/ProcLog
>
> PROCD_MAX_SNAPSHOT_INTERVAL = 60
>
> QUEUE_SUPER_USERS = condor, SYSTEM
>
> QUILL = $(SBIN)/condor_quill.exe
>
> QUILL_ADDRESS_FILE = $(LOG)/.quill_address
>
> QUILL_LOG = $(LOG)/QuillLog
>
> REAL_GID = 666
>
> REAL_UID = 666
>
> RELEASE_DIR = D:\condor
>
> REPLICATION_DEBUG =
>
> REPLICATION_LOG = $(LOG)/ReplicationLog
>
> REQUIRE_LOCAL_CONFIG_FILE = FALSE
>
> RESERVED_DISK = 5
>
> ROOSTER = $(LIBEXEC)/condor_rooster.exe
>
> ROOSTER_DEBUG =
>
> ROOSTER_LOG = $(LOG)/RoosterLog
>
> RUNBENCHMARKS = (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 *
> $(HOUR)))
>
> SBIN = $(BIN)
>
> SCHEDD = $(SBIN)/condor_schedd.exe
>
> SCHEDD_ADDRESS_FILE = $(SPOOL)/.schedd_address
>
> SCHEDD_DAEMON_AD_FILE = $(SPOOL)/.schedd_classad
>
> SCHEDD_DEBUG = D_PID
>
> SCHEDD_LOG = $(LOG)/SchedLog
>
> SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
>
> SEC_CONFIG_AUTHENTICATION = REQUIRED
>
> SEC_CONFIG_ENCRYPTION = REQUIRED
>
> SEC_CONFIG_INTEGRITY = REQUIRED
>
> SEC_CONFIG_NEGOTIATION = REQUIRED
>
> SHADOW = $(SBIN)/condor_shadow.exe
>
> SHADOW_DEBUG =
>
> SHADOW_LIST = SHADOW
>
> SHADOW_LOCK = $(LOCK)/ShadowLock
>
> SHADOW_LOG = $(LOG)/ShadowLog
>
> SHADOW_STANDARD = $(SBIN)/condor_shadow.std.exe
>
> SHARED_PORT = $(LIBEXEC)/condor_shared_port.exe
>
> SHARED_PORT_DEBUG =
>
> SHARED_PORT_LOG = $(LOG)/SharedPortLog
>
> SMALLJOB = (TARGET.ImageSize <  (15 * 1024))
>
> SMTP_SERVER =
>
> SPOOL = $(LOCAL_DIR)/spool
>
> STANDARD = 1
>
> START = TRUE
>
> STARTD = $(SBIN)/condor_startd.exe
>
> STARTD_ADDRESS_FILE = $(LOG)/.startd_address
>
> STARTD_ATTRS = COLLECTOR_HOST_STRING, OasysGSA
>
> STARTD_DEBUG =
>
> STARTD_JOB_EXPRS = ImageSize, ExecutableSize, JobUniverse, NiceUser
>
> STARTD_LOG = $(LOG)/StartLog
>
> STARTER = $(SBIN)/condor_starter.exe
>
> STARTER_ALLOW_RUNAS_OWNER = True
>
> STARTER_LIST = STARTER
>
> STARTER_LOCAL = $(SBIN)/condor_starter.exe
>
> STARTER_LOG = $(LOG)/StarterLog
>
> STARTER_STANDARD = $(SBIN)/condor_starter.std.exe
>
> STARTIDLETIME = 15 * $(MINUTE)
>
> STATETIMER = (time() - EnteredCurrentState)
>
> STORK = $(SBIN)/stork_server
>
> STORK_ADDRESS_FILE = $(LOG)/.stork_address
>
> STORK_ARGS = -p $(STORK_PORT) -f -Serverlog $(STORK_LOG_BASE)
>
> STORK_DEBUG = D_FULLDEBUG
>
> STORK_LOG = $(LOG)/StorkLog
>
> STORK_LOG_BASE = $(LOG)/Stork
>
> STORK_PORT = 9621
>
> SUBSYSTEM = TOOL
>
> SUSPEND = FALSE
>
> TESTINGMODE_CLAIM_WORKLIFE = 1200
>
> TESTINGMODE_CONTINUE = True
>
> TESTINGMODE_KILL = False
>
> TESTINGMODE_PERIODIC_CHECKPOINT = False
>
> TESTINGMODE_PREEMPT = False
>
> TESTINGMODE_PREEMPTION_RANK = 0
>
> TESTINGMODE_PREEMPTION_REQUIREMENTS = False
>
> TESTINGMODE_START = True
>
> TESTINGMODE_SUSPEND = False
>
> TESTINGMODE_WANT_SUSPEND = False
>
> TESTINGMODE_WANT_VACATE = False
>
> TRANSFERER = $(LIBEXEC)/condor_transferer.exe
>
> TRANSFERER_DEBUG =
>
> TRANSFERER_LOG = $(LOG)/TransfererLog
>
> UID_DOMAIN = firm.com
>
> UNAME_ARCH = X86_64
>
> UNAME_OPSYS = WINDOWS
>
> UNICORE_GAHP = $(SBIN)/unicore_gahp
>
> USERNAME = thomas.chandler
>
> UWCS_CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) && (KeyboardIdle >
> $(ContinueIdleTime)) )
>
> UWCS_KILL = false
>
> UWCS_MAXJOBRETIREMENTTIME = 0
>
> UWCS_NEGOTIATOR_POST_JOB_RANK = (RemoteOwner =?= UNDEFINED) * (KFlops -
> SlotID - 1.0e10*(Offline=?=True))
>
> UWCS_NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED
>
> UWCS_PERIODIC_CHECKPOINT = $(LastCkpt) > (3 * $(HOUR) +
> $RANDOM_INTEGER(-30,30,1) * $(MINUTE) )
>
> UWCS_PREEMPT = ( ((Activity == "Suspended") && ($(ActivityTimer) >
> $(MaxSuspendTime))) || (SUSPEND && (WANT_SUSPEND == False)) )
>
> UWCS_PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
>
> UWCS_PREEMPTION_REQUIREMENTS = ((SubmitterGroup =?= RemoteGroup) &&
> ($(StateTimer) > (1 * $(HOUR))) && (RemoteUserPrio >
> TARGET.SubmitterUserPrio * 1.2)) || (MY.NiceUser == True)
>
> UWCS_START = ( (KeyboardIdle > $(StartIdleTime)) && ( $(CPUIdle) || (State
> != "Unclaimed" && State != "Owner")) )
>
> UWCS_SUSPEND = ( $(KeyboardBusy) || ( (CpuBusyTime > 2 * $(MINUTE)) &&
> $(ActivationTimer) > 90 ) )
>
> UWCS_WANT_SUSPEND = ( $(SmallJob) || $(KeyboardNotBusy) || $(IsVanilla) ) &&
> ( $(SUSPEND) )
>
> UWCS_WANT_VACATE = ( $(ActivationTimer) > 10 * $(MINUTE) || $(IsVanilla) )
>
> VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history,
> Accountant.log, Accountantnew.log, local_univ_execute, .quillwritepassword,
> .pgpass, .schedd_address, .schedd_classad
>
> VANILLA = 5
>
> VM = 13
>
> VM_GAHP_LOG = NUL
>
> VM_GAHP_SERVER = $(SBIN)/condor_vm-gahp.exe
>
> VM_MAX_NUMBER = $(NUM_CPUS)
>
> VM_MEMORY = 128
>
> VM_NETWORKING = FALSE
>
> VM_NETWORKING_DEFAULT_TYPE = nat
>
> VM_NETWORKING_TYPE = nat
>
> VM_TYPE =
>
> VMWARE_BRIDGE_NETWORKING_TYPE = bridged
>
> VMWARE_LOCAL_SETTINGS_FILE = $(RELEASE_DIR)/condor_vmware_local_settings
>
> VMWARE_NAT_NETWORKING_TYPE = nat
>
> VMWARE_NETWORKING_TYPE = nat
>
> VMWARE_PERL = perl
>
> VMWARE_SCRIPT = $(SBIN)/condor_vm_vmware.exe
>
> WANT_SUSPEND = TRUE
>
> WANT_VACATE = FALSE
>
> WINDOWS_RMDIR = $(SBIN)\condor_rmdir.exe
>
> WINDOWS_SOFTKILL = $(SBIN)/condor_softkill.exe
>
>
>
> C:\>
>
>
>
>
>
>
>
>
>
>
>
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf
> Of Ben Cotton
> Sent: 19 October 2013 09:37
> To: HTCondor-Users Mail List
> Subject: Re: [HTCondor-users] Master Daemon - can't find address
>
>
>
>
>
> On Fri, Oct 18, 2013 at 9:33 PM, Andrew Mole <Andrew.Mole@xxxxxxxx> wrote:
>
> Does anyone have any insight on this?
>
>
>
> Can you share the output of `condor_config_val -dump` on that machine?
>
>
>
> --
>
> Ben Cotton
> Senior Support Engineer
> Cycle Computing, LLC
> The Leader in Utility Supercomputing and Cloud HPC Software
>
> ____________________________________________________________
> Electronic mail messages entering and leaving Arup  business
> systems are scanned for acceptability of content and viruses
>
> _______________________________________________
>
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>
>



-- 
HTCondor Project Windows Developer / NEOS Maintainer