
Re: [HTCondor-users] [Condor-users] condor_transfer_data problem on major version switch



On 10/29/2012 03:05 PM, Ian Cottam wrote:
On 26/10/2012 09:37, "Max Fischer" <mfischer@xxxxxxxxxxxxxxxxxxxx> wrote:

Hi all,

we have recently begun testing remote features in our glidein/condor
pool to allow people from our institute to use condor from any
authorised device (laptops, heterogeneous work pools, etc.) without
having to worry about any permanent condor infrastructure there.
Basically, we want to supply a drastically cut-down condor
installation via a shared disk, providing only the commands necessary
for interfacing with the remote daemons. As we are still in the
testing phase, though, we are using a full condor suite (i.e. all
bin, sbin, libraries, etc.) at the moment.
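
As a rough sketch, a minimal client-side condor_config for such a
tools-only setup could look like the following (host names and paths
are placeholders, not our actual values):

    # minimal condor_config sketch for a submit-only client on a shared disk
    RELEASE_DIR    = /shared/condor
    LOCAL_DIR      = /tmp/condor_local
    CONDOR_HOST    = central-manager.example.org
    COLLECTOR_HOST = $(CONDOR_HOST)
    # point condor_submit/condor_q at the remote schedd
    SCHEDD_HOST    = submit.example.org
    # run no daemons locally - command-line tools only
    DAEMON_LIST    =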
Can't help with your detailed, specific (glidein) request, but just to add
that we do something similar via a system we call DropAndCompute. Its
evolution is described here <http://www.walkingrandomly.com/?p=3339>.

I'm sure we could provide our scripts if anyone wanted them.
regards

-Ian


Very interesting tool. So far, we have expanded our own in-house terminal tool for job submission (Grid-Control) to include wrappers for interfacing with the user side of a Condor pool. This is mainly motivated by giving our users a smooth transition, plus a lot of added functionality for handling grid-specific jobs. Still, the main motivation for introducing Condor/Glideins is ease of use: a good portion of our use cases does not involve any grid functionality and is more akin to using our pool as an extended, homogeneous computing resource. That scenario seems much better covered by your DropAndCompute system with its simple drag & drop (or cp for terminal users?), whereas our approach requires users to handle yet another interface.

Did you configure your submit node in a specific way to reduce the knowledge users must have about the condor architecture? Do you, for example, apply special defaults to force broken drop-jobs out of the queue, or set log, output and error by default? It seems like a natural step to have simpler jobs handled automatically: say a user drops only an executable and a data file, and a background script writes out a basic JDL that transfers all files, runs the executable and adds some basic requirements (e.g. sending a #!/bin/bash executable to a Linux arch), then handles it like a regular DropAndCompute job.
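
For illustration, a minimal submit description such a script might generate could look like the following (the file names and the requirements expression are made up):

    # hypothetical submit file written by the drop handler for a dropped
    # executable (run.sh) and data file (input.dat)
    universe                = vanilla
    executable              = run.sh
    transfer_input_files    = input.dat
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    requirements            = (OpSys == "LINUX")
    log                     = run.sh.log
    output                  = run.sh.out
    error                   = run.sh.err
    queue

The script would then only have to run condor_submit on that file and treat the result like any other drop job.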

-Max