
Re: [Condor-users] Third-party file staging with Condor-G



condor-users-bounces@xxxxxxxxxxx wrote on 01/31/2008 09:30:56 PM:

> On Jan 30, 2008, at 10:59 AM, Jan Ploski wrote:
> 
> > The following message thread shows (I think) that third-party file staging
> > (from/to a GridFTP server different than the submission host) is not
> > implemented in Condor-G and currently not possible:
> >
> > https://lists.cs.wisc.edu/archive/condor-users/2006-July/msg00059.shtml
> >
> > The most recent suggestion in that thread was to use a substitution macro
> > in order to refer to the matched target GT4 host:
> >
> > https://lists.cs.wisc.edu/archive/condor-users/2006-July/msg00128.shtml
> >
> > However, I can't see how this alone could solve the staging problem. The
> > required transferCredentialEndpoint must refer to a piece of data which
> > 1) has to be created by Condor before the WS-GRAM job is created (this
> > possibly only occurs if you specify input_files/output_files?)
> > 2) has to be known by id to the job submitter - but I didn't find any
> > macro which would expand to this id
> >
> > So I guess I have to repeat the original question: is WS-GRAM-level
> > third-party file staging possible with Condor-G somehow, nevertheless? If
> > not, are there any plans to implement it? The quickest hack which would
> > already bring a big improvement would be to make it possible to inject one
> > or more <transfer> elements into the Condor-generated <transferRequest>
> > element.
> 
> 
> Third-party file staging is something we're interested in adding to 
> Condor-G, but there hasn't been enough user demand for us to spend the 
> time implementing it.

Count me as demand ;-) Seriously, I think that this is an important 
feature not because of high demand, but because it would put Condor-G on 
par with other Grid metaschedulers. As is, Condor-G caters mostly to the 
scenarios of existing Condor users (understandably). I think there may be 
a sort of feedback loop at work - the feature is not implemented, 
therefore the users don't contemplate taking advantage of third-party 
storage, therefore the feature is not implemented. However, maintaining 
huge files on the client file system is "just not right" in the Grid 
context. For those who wish to use Condor primarily as a Grid 
(meta)scheduler, I see two show-stoppers right now:
1) this third-party transfer feature
2) proxy renewal (and forwarding to the running job) for GT4

> You can fake it by disabling all Condor-managed file transfer in your 
> submit file and adding your own transfer directives like so:
> 
> transfer_executable = false
> transfer_input = false
> transfer_output = false
> transfer_error = false
> remote_initialdir = <somewhere>
> should_transfer_files = NO
> globus_xml = <file staging directives>
> 
> This requires you to do your own credential delegation for the 
> transfers.

OK, so the credential delegation would have to happen, for example, in a 
PRE script of a DAG, which would give me the id that I need to reference 
in globus_xml, right?
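
To make the question concrete, here is a minimal sketch of what such a PRE script might do once it has obtained the staging directives (including the delegated credential reference). It only writes out the submit-file fragment quoted above. The function name render_submit_fragment and its parameters are hypothetical, the delegation step itself is not shown, and collapsing the XML onto a single line is my assumption about what is safest for the globus_xml value.

```python
# Hypothetical helper for a DAG PRE script: renders the submit-file
# fragment quoted above once the staging XML is known.
# The delegation step that produces the credential id is NOT shown here.

def render_submit_fragment(remote_initialdir, staging_xml):
    """Return submit-file lines that disable Condor-managed transfers
    and hand the caller's own staging directives to globus_xml."""
    # Assumption: keep the globus_xml value on one line by collapsing
    # all runs of whitespace (including newlines) to single spaces.
    xml_one_line = " ".join(staging_xml.split())
    lines = [
        "transfer_executable = false",
        "transfer_input = false",
        "transfer_output = false",
        "transfer_error = false",
        "remote_initialdir = " + remote_initialdir,
        "should_transfer_files = NO",
        "globus_xml = " + xml_one_line,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder directives; real ones would carry the delegated
    # credential reference obtained earlier in the PRE script.
    print(render_submit_fragment("/tmp/work", "<transfer>\n  ...\n</transfer>"), end="")
```

The PRE script would append this fragment to the rest of the submit description before DAGMan submits the node job.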

Best regards,
Jan Ploski