
Re: [HTCondor-users] Remote Submit Scheds with grid submission to CondorCE



Hi Thomas, all,
I would not advise making it too easy for end users to interact with CEs,
else you are "asking" for them to do silly things like "give me the status
of all jobs and I will filter out mine" every 10 seconds...

Furthermore, users must not be able to see the status of other people's
jobs, as that would go against the GDPR, sorry...

________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Thomas Hartmann [thomas.hartmann@xxxxxxx]
Sent: 16 February 2021 15:01
To: Brian Hua Lin; HTCondor-Users Mail List
Subject: Re: [HTCondor-users] Remote Submit Scheds with grid submission to CondorCE

Hi Brian,

many thanks for the hint!
Talking to the CondorCE schedd directly [1] makes it much easier and is
probably the best solution to avoid any local detours here.

AFAIS, one has to set the Owner to `undefined` for the CE to map the DN
to a local user, right?
I guess it is not straightforward to query jobs on the CE with plain
`condor_q`, as one would have to know the remote mapped user and there is
no option to forward the X509 proxy for authz in condor_q.

Cheers and thanks,
   Thomas

ps: the general idea would be to use it not for actual user job
submissions to the grid (probably there is hardly any user left using
the grid directly) - but to use it for power users/VOs for debugging and
for our functionality tests to become more generic.


[1]
 >  condor_submit -remote grid-htcondorce0.desy.de -pool grid-htcondorce0:9619 HTCondorCE.submit

 > cat HTCondorCE.submit
# universe = grid
# grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619
universe = vanilla
use_x509userproxy = true
X509UserProxy=$ENV(X509_USER_PROXY)
# how to deal with dn:uid mapping?
+Owner = undefined
# Files
executable = mypayload.sh
output = stdout
error = stderr
log = logs
# File transfer behavior
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
# Resources CE
#+xcount = 1
#+maxMemory = 2000
#+maxWallTime = 10
### +remote_queue = "osg"  # Request the OSG queue #??
queue
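
For functionality tests it may help to assemble the command from [1] programmatically. The following is only a minimal sketch, not part of the original setup: the helper name `build_remote_submit_cmd` is hypothetical, it follows the `<CE FQDN>:<CE PORT>` pattern for `-pool`, and it assumes port 9619 as used above. It only builds and prints the command line and does not run `condor_submit` itself.

```python
import shlex
from typing import List


def build_remote_submit_cmd(ce_fqdn: str, submit_file: str, ce_port: int = 9619) -> List[str]:
    """Build the argv for submitting directly to a CE schedd.

    Hypothetical helper: it only assembles the arguments and never
    contacts the CE or checks that the submit file exists.
    """
    if not ce_fqdn or not submit_file:
        raise ValueError("CE host name and submit file are both required")
    return [
        "condor_submit",
        "-remote", ce_fqdn,               # name of the remote (CE) schedd
        "-pool", f"{ce_fqdn}:{ce_port}",  # collector of the CE
        submit_file,
    ]


if __name__ == "__main__":
    cmd = build_remote_submit_cmd("grid-htcondorce0.desy.de", "HTCondorCE.submit")
    # Print a copy-and-paste friendly command line instead of executing it.
    print(shlex.join(cmd))
```

The returned list could be handed to `subprocess.run` on a host where `condor_submit` and a valid proxy are available; that step is left out here on purpose.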




On 11/02/2021 18.29, Brian Lin wrote:
> Hi all,
>
> If there's no local Schedd/Gridmanager but your users have access to
> condor_submit, then you can run something like:
>
> condor_submit -remote <CE FQDN> -pool <CE FQDN>:<CE PORT> <vanilla universe submit file>
>
> This is effectively what condor_ce_trace [1] and condor_ce_run [2] do.
> However, there are a few caveats to this method of submission:
>
> - HTCondor-CE is designed with pilot jobs in mind, not user jobs. This
> doesn't mean that users can't submit their jobs to a CE, it's just that
> the CE may be very aggressive in job cleanup in ways that are unfriendly
> to end users
> - X.509 proxies will be sent initially but renewals will not be
> forwarded on (assuming x509userproxy/use_x509userproxy are set
> appropriately)
> - Users will need to run commands to fetch their output, making this
> method particularly ineffective for long workflows
>
> - Brian
>
> [1] https://github.com/htcondor/htcondor-ce/blob/master/src/condor_ce_trace
> [2] https://github.com/htcondor/htcondor-ce/blob/master/src/condor_ce_run
>
> On 2/11/21 11:13 AM, Thomas Hartmann wrote:
>> Hi Maarten,
>>
>> maybe I am overthinking it, but AFAIS the problem is that the remote
>> submit needs to forward the proxy.
>>
>> The user servers act as remote submitters: they do not run a local
>> schedd but only forward the jobs to the actual schedulers. I am not
>> sure whether the X509 proxies are forwarded.
>>
>> Cheers,
>>   Thomas
>>
>> On 11/02/2021 18.09, Maarten Litmaath wrote:
>>> Hi Thomas,
>>> could it be as simple as using a JDL like this:
>>>
>>> cmd = ...
>>> output = ...
>>> error = ...
>>> log = ...
>>> +TransferOutput = ""
>>> periodic_hold = ...
>>> periodic_remove = ...
>>> universe = grid
>>> grid_resource = condor htc-ce.some.domain htc-ce.some.domain:9619
>>> use_x509userproxy = true
>>> environment = ...
>>> queue 1
>>>
>>> That is how ALICE lets a local HTCondor service deal with all
>>> the intricacies of handling jobs for remote HTCondor CEs.
>>>
>>> Maybe I am missing something?
>>>
>>> ________________________________________
>>> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf
>>> of Thomas Hartmann [thomas.hartmann@xxxxxxx]
>>> Sent: 11 February 2021 17:46
>>> To: HTCondor-Users Mail List
>>> Subject: [HTCondor-users] Remote Submit Scheds with grid submission
>>> to  CondorCE
>>>
>>> Hi all,
>>>
>>> does anybody already have experience with combining remote
>>> submission with grid submission to CondorCEs?
>>>
>>> The thing is that we have a number of nodes for our users that act as
>>> remote submitters to the actual schedulers.
>>>
>>> Now it would come in handy not only to submit to the local Condor but
>>> also to allow our users to submit into the Grid, targeting CondorCEs.
>>> Since the Schedd would have to authenticate with the user's X509
>>> proxy, the proxy would need to be passed forward like Kerberos etc.
>>> tickets/tokens.
>>> Are there other caveats that might need to be considered in such a
>>> setup?
>>>
>>> Maybe somebody already has a similar working setup?
>>>
>>> Cheers,
>>>     Thomas
>>>
>>>