
Re: [HTCondor-users] basic condor sched setup for CondorCE submissions



Hi Thomas,

condor_ce_submit is really just a convenience wrapper around condor_submit to submit to the CE schedd *when you are on the same host*. If you're submitting from a remote client, you'll want to use HTCondor-C for job submission [1], for which you have two options:

1. If you have a schedd running on the remote client, you can submit grid universe jobs to the CE schedd (recommended)
2. If you don't have a schedd running on the remote client, you can submit vanilla jobs directly into the CE schedd with `condor_submit -remote <CE hostname> -pool <CE hostname>:9619 ...`. With this method, you'll have to fetch the results manually with `condor_transfer_data`.
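
For option 1, a minimal sketch of a grid universe submit description could look like the following. It is an illustration rather than a tested file: the CE hostname, executable path, and output file names are taken from the submit file quoted below, and the first value after `condor` in `grid_resource` assumes the CE schedd is named after the CE host.

```
# Grid universe job routed to the remote CE schedd via HTCondor-C.
universe = grid
# Format: condor <remote schedd name> <remote collector host:port>
# (assumption: the CE schedd name equals the CE hostname)
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619
use_x509userproxy = true
executable = /opt/misc/tools/CETests/CET.sh
output = stdout
error = stderr
log = logs
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
```

A file like this would be submitted with plain `condor_submit` to the local schedd on the client, not with `condor_ce_submit`; the local schedd then forwards the job to the CE.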

- Brian

[1] https://htcondor.readthedocs.io/en/stable/grid-computing/grid-universe.html#htcondor-c-job-submission

On 7/16/20 9:49 AM, Thomas Hartmann wrote:
Hi again for another CE question ;)

I am now trying to set up a simple submit host for functionality tests
against a CondorCE.
Since no further functionality is necessary, I aim for a simple setup
that could run in a small VM or as a container.

The OSG documentation on remote submits to a CondorCE [1] suggests
installing the condor package for a basic schedd. Since the packages from
the generic stable/dev repos do not come with a schedd repo, I tried to
define a basic config with just a master and a schedd daemon running. But
condor_ce_submit still failed while trying to connect to the local schedd.

Next I extracted from the OSG condor flavour [2] the basic configs,
where [COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD] are set up.

Unfortunately, condor_ce_submit/the wrapped condor_submit still fails
with [3] for a submit description like [4]. As far as I can see, a local
schedd is up and running [5], so I wonder what a schedd setup should
look like to only submit to a remote Condor CE.

Since condor_ce_trace works in running the basic payload, I assume that
the CEs are working - "in principle" ;)

Alternatively, would it be reasonable to connect directly to the
collector, as condor_ce_trace does, to submit a payload?

Cheers and thanks,
  Thomas


[1]
https://opensciencegrid.org/docs/compute-element/submit-htcondor-ce/#from-another-host

[2]
https://repo.opensciencegrid.org/osg/3.5/el7/release/x86_64/condor-8.8.9-1.1.osg35.el7.x86_64.rpm
  /etc/condor/config.d/00-osg_default_*

[3]
condor_ce_submit  -debug HTCondorCETest.submit
07/16/20 16:45:55 attempt to connect to <131.169.223.90:9619> failed: Connection refused (connect errno = 111).

ERROR: Can't find address of local schedd

[4]
cat HTCondorCETest.submit
universe = vanilla
use_x509userproxy = true
+Owner = undefined
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619
executable = /opt/misc/tools/CETests/CET.sh
output = stdout
error = stderr
log = logs
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
queue


[5]
07/16/20 16:42:27 (pid:25782) Setting maximum file descriptors to 4096.
07/16/20 16:42:27 (pid:25782) ******************************************************
07/16/20 16:42:27 (pid:25782) ** condor_schedd (CONDOR_SCHEDD) STARTING UP
07/16/20 16:42:27 (pid:25782) ** /usr/sbin/condor_schedd
07/16/20 16:42:27 (pid:25782) ** SubsystemInfo: name=SCHEDD type=SCHEDD(5) class=DAEMON(1)
07/16/20 16:42:27 (pid:25782) ** Configuration: subsystem:SCHEDD local:<NONE> class:DAEMON
07/16/20 16:42:27 (pid:25782) ** $CondorVersion: 8.9.7 May 19 2020 BuildID: 504263 PackageID: 8.9.7-1 $
07/16/20 16:42:27 (pid:25782) ** $CondorPlatform: x86_64_CentOS7 $
07/16/20 16:42:27 (pid:25782) ** PID = 25782
07/16/20 16:42:27 (pid:25782) ** Log last touched 7/16 16:42:11
07/16/20 16:42:27 (pid:25782) ******************************************************
07/16/20 16:42:27 (pid:25782) Using config source: /etc/condor/condor_config
07/16/20 16:42:27 (pid:25782) Using local config sources:
07/16/20 16:42:27 (pid:25782)    /etc/condor/config.d/00-osg_default_daemons.config
07/16/20 16:42:27 (pid:25782)    /etc/condor/config.d/00-osg_default_security.config
07/16/20 16:42:27 (pid:25782)    /etc/condor/config.d/00-restart_peaceful.config
07/16/20 16:42:27 (pid:25782)    /etc/condor/config.d/10-batch_gahp_blahp.config
07/16/20 16:42:27 (pid:25782)    /etc/condor/condor_config.local
07/16/20 16:42:27 (pid:25782) config Macros = 77, Sorted = 77, StringBytes = 2515, TablesBytes = 2844
