
Re: [HTCondor-users] [Condor-users] condor_transfer_data problem on major version switch



On 10/29/2012 03:05 PM, Ian Cottam wrote:
On 26/10/2012 09:37, "Max Fischer" <mfischer@xxxxxxxxxxxxxxxxxxxx> wrote:

Hi all,

we have recently begun testing remote features in our glidein/condor
pool to allow people from our institute to use condor from any
authorised device (laptops, heterogeneous work pools, etc.) without
having to worry about any permanent condor infrastructure there.
Basically, we want to supply a drastically cut-down condor
installation via a shared disk, providing only the commands necessary
for interfacing with the remote daemons. As we are still in the
testing phase, though, we are using a full condor suite (i.e. all
bin, sbin, libraries, etc.) at the moment.
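
As a rough sketch, a minimal client-side condor_config for such a
tools-only setup could look like the following (host names and paths
are placeholders, not our actual values):

    # minimal condor_config sketch for a submit-only client on a shared disk
    RELEASE_DIR    = /shared/condor
    LOCAL_DIR      = /tmp/condor_local
    CONDOR_HOST    = central-manager.example.org
    COLLECTOR_HOST = $(CONDOR_HOST)
    # point condor_submit/condor_q at the remote schedd
    SCHEDD_HOST    = submit.example.org
    # run no daemons locally - command-line tools only
    DAEMON_LIST    =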
Can't help with your detailed, specific (glidein) request, but just to add
that we do something similar via a system we call DropAndCompute. Its
evolution is described here <http://www.walkingrandomly.com/?p=3339>.

I'm sure we could provide our scripts if anyone wanted them.
regards

-Ian


Very interesting tool. So far, we have expanded our own in-house terminal tool for job submission (Grid-Control) to include wrappers for interfacing with the user side of a Condor pool. This is mainly motivated by giving our users a smooth transition, plus a lot of added functionality for handling grid-specific jobs. Still, the main motivation for introducing Condor/Glideins is ease of use: a good portion of our use cases does not involve any grid functionality and is more akin to using our pool as an extended, homogeneous computing resource. That scenario seems much better covered by your DropAndCompute system with its simple drag & drop (or cp for terminal users?), whereas our approach requires users to handle yet another interface.

Did you configure your submit node in a specific way to reduce the knowledge users must have about the condor architecture? Do you, for example, apply special defaults to force broken drop-jobs out of the queue, or set log, output and error by default? It seems like a natural step to have simpler jobs handled automatically: say a user drops only an executable and a data file, and a background script writes out a basic JDL that transfers all files, runs the executable and adds some basic requirements (e.g. sending a #!/bin/bash executable to a Linux arch), then handles it like a regular DropAndCompute job.
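
For illustration, a minimal submit description such a script might generate could look like the following (the file names and the requirements expression are made up):

    # hypothetical submit file written by the drop handler for a dropped
    # executable (run.sh) and data file (input.dat)
    universe                = vanilla
    executable              = run.sh
    transfer_input_files    = input.dat
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    requirements            = (OpSys == "LINUX")
    log                     = run.sh.log
    output                  = run.sh.out
    error                   = run.sh.err
    queue

The script would then only have to run condor_submit on that file and treat the result like any other drop job.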

-Max