Re: [HTCondor-users] Ubuntu 14.04 GPU functionality



Thank you for responding, Tim.  I copied the more verbose example
condor_config file over the default one and failed to realize that the paths
in the sample did not match those of the installed configuration.  Problem
resolved.  Thank you very much!
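
For readers who land on this thread with the same error, here is a minimal
sketch of the kind of override that addresses it. It assumes, as described in
the original report below, that the package installs its helper programs under
/usr/lib/condor/libexec. The failing commands in the log are both looked up
under /usr/libexec, which is consistent with LIBEXEC being derived from
RELEASE_DIR = /usr in the sample config; that derivation is an inference from
the log, and the path is specific to this install.

```
# /etc/condor/condor_config.local
# Assumption: condor_gpu_discovery lives in this directory on this install
# (path taken from the original report); adjust it for other systems.
LIBEXEC = /usr/lib/condor/libexec

# Lines from the original report, unchanged.
use feature : GPUs
GPU_DISCOVERY_EXTRA = -extra
```

After restarting, "condor_config_val LIBEXEC" should print the directory that
actually contains condor_gpu_discovery.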

Mike

On Wednesday, January 07, 2015 03:15:25 PM Tim Theisen wrote:
> Hi Michael,
> 
> I installed condor-8.3.2-288596-ubuntu_14.04_amd64.deb on a newly
> installed VM and added the configuration you described and did not have
> the problem that you described.
> 
> Could you send the output from "condor_config_val -dump -verbose" off list?
> 
> ...Tim
> 
> On 01/06/2015 07:10 PM, Michael Murphy wrote:
> > Dear all,
> > 
> > I recently installed condor 8.3.2 via dpkg in a clean Ubuntu 14.04 OS.
> > I added the lines:
> > 
> > use feature : GPUs
> > 
> > GPU_DISCOVERY_EXTRA = -extra
> > 
> > into the condor_config.local file located in /etc/condor.
> > 
> > Problems arise when I start condor via "sudo service condor start". All
> > of the daemons in the DAEMON_LIST in the local file start except for
> > STARTD.
> > 
> > Here is the StartLog. I'm not sure how to fix this.
> > 
> > 01/06/15 19:04:02 ******************************************************
> > 01/06/15 19:04:02 ** condor_startd (CONDOR_STARTD) STARTING UP
> > 01/06/15 19:04:02 ** /usr/sbin/condor_startd
> > 01/06/15 19:04:02 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
> > 01/06/15 19:04:02 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
> > 01/06/15 19:04:02 ** $CondorVersion: 8.3.2 Dec 16 2014 BuildID: 288596 $
> > 01/06/15 19:04:02 ** $CondorPlatform: x86_64_Ubuntu14 $
> > 01/06/15 19:04:02 ** PID = 16066
> > 01/06/15 19:04:02 ** Log last touched 1/6 18:59:37
> > 01/06/15 19:04:02 ******************************************************
> > 01/06/15 19:04:02 Using config source: /etc/condor/condor_config
> > 01/06/15 19:04:02 Using local config sources:
> > 01/06/15 19:04:02 /etc/condor/condor_config.local
> > 01/06/15 19:04:02 config Macros = 88, Sorted = 88, StringBytes = 2901, TablesBytes = 3216
> > 01/06/15 19:04:02 CLASSAD_CACHING is ENABLED
> > 01/06/15 19:04:02 Daemon Log is logging: D_ALWAYS D_ERROR
> > 01/06/15 19:04:02 Daemoncore: Listening at <0.0.0.0:43738> on TCP (ReliSock) and UDP (SafeSock).
> > 01/06/15 19:04:02 DaemonCore: command socket at <192.168.6.108:43738>
> > 01/06/15 19:04:02 DaemonCore: private command socket at <192.168.6.108:43738>
> > 01/06/15 19:04:02 my_popenv failed
> > 01/06/15 19:04:02 Failed to run hibernation plugin '/usr/libexec/condor_power_state ad'
> > 01/06/15 19:04:02 VM-gahp server reported an internal error
> > 01/06/15 19:04:02 VM universe will be tested to check if it is available
> > 01/06/15 19:04:02 History file rotation is enabled.
> > 01/06/15 19:04:02 Maximum history file size is: 20971520 bytes
> > 01/06/15 19:04:02 Number of rotated history files is: 2
> > 01/06/15 19:04:02 ERROR "Failed to execute local resource 'GPUs' inventory script "/usr/libexec/condor_gpu_discovery -properties -extra"" at line 625 in file /slots/01/dir_53959/userdir/src/condor_startd.V6/ResAttributes.cpp
> > 
> > The condor_gpu_discovery script is located in /usr/lib/condor/libexec/,
> > not in /usr/libexec. What variable do I need to set for condor to find
> > this file in its correct location? The relevant variables from the
> > global config file are as follows:
> > 
> > ##--------------------------------------------------------------------
> > ## Pathnames:
> > ##--------------------------------------------------------------------
> > ## Where have you installed the bin, sbin and lib condor directories?
> > RELEASE_DIR = /usr
> > 
> > ## Where is the local condor directory for each host?
> > ## This is where the local config file(s), logs and
> > ## spool/execute directories are located
> > LOCAL_DIR = /var/condor
> > #LOCAL_DIR = $(RELEASE_DIR)/hosts/$(HOSTNAME)
> > 
> > ## Where is the machine-specific local config file for each host?
> > CONFIG_DIR = /etc/condor
> > LOCAL_CONFIG_FILE = $(CONFIG_DIR)/condor_config.local
> > 
> > ## Where are optional machine-specific local config files located?
> > ## Config files are included in lexicographic order.
> > LOCAL_CONFIG_DIR = $(LOCAL_DIR)/config
> > 
> > ## Blacklist for file processing in the LOCAL_CONFIG_DIR
> > ## LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$
> > 
> > ## If the local config file is not present, is it an error?
> > ## WARNING: This is a potential security issue.
> > ## If not specified, the default is True
> > #REQUIRE_LOCAL_CONFIG_FILE = TRUE
> > 
> > Any help would be greatly appreciated.
> > 
> > Michael McInerny Murphy
> > 
> > Engineer
> > 
> > IERUS Technologies, Inc.
> > 
> > 2904 Westcorp Blvd., Suite 210
> > 
> > (256) 319-2026 x 107
> > 
> > 
> > 