
Re: [condor-users] running globus jobs without a shared filesystem



On Tue, 2003-12-09 at 23:18, Dan Bradley wrote:
> mcal00@xxxxxxxxxxxxx wrote:
> 
> >There's a line in the message you mention that says:
> >
> >"This isn't much of a problem for output files, as Condor can be told transfer
> >all output files from the job (those created or modified after the job began
> >running)."
> >
> >Well, I could never get it to do it.
> >
> 
> I believe the confusion here is whether you are talking about stage-back 
> of files from the execution node to the gatekeeper node (which is what 
> Alain was talking about), or whether you are talking about stage-back of 
> files to the original point of submission, which is what I think you are 
> talking about, right Mark?

Spot on. Our problem was getting the output files back to the initial
client machine.

> In any case, there is _no_ option in Condor-G to automatically stage 
> back output files to the submission site.  I am right now updating the 
> manual in several places where the different behavior for Globus 
> universe is not made clear.  The transfer-files options Alain was 
> talking about only handle copying back files from the execution machine 
> to the gatekeeper machine.

OK, then I didn't understand his initial point.

> An additional transfer must usually take place between the gatekeeper 
> and the submit machine.  If you know the files in advance, you can 
> specify them explicitly in the transfer_output_files list.  This 
> mechanism (new in Condor 6.5) uses GRAM file staging.  I personally do 
> not know if GRAM file staging is robust enough to handle large data flows.

Fair enough, and this will certainly be useful for some of our jobs where
the output files are known in advance, but some jobs generate files on
the fly. In any case, together with our fork 'n' exec option (which is
only activated if the submitted RSL string contains the relevant
customised tag), I think we can now handle all cases.
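
For the archives, my reading of the manual is that a globus universe
submit description using this mechanism would look something like the
sketch below. The gatekeeper, executable and file names are placeholders
I've made up, and I haven't tested this myself:

    # Hypothetical Condor-G submit file: stage the named output files back
    # from the gatekeeper via GRAM file staging (new in Condor 6.5).
    universe              = globus
    globusscheduler       = gatekeeper.example.ac.uk/jobmanager-pbs
    executable            = my_analysis
    transfer_output_files = results.dat,summary.log
    output                = job.out
    error                 = job.err
    log                   = job.log
    queue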
 
> Probably the most common technique is to submit a follow-up "fork" job 
> that sends files back via gsiftp.  It sounds like you have a solution 
> similar to this which effectively spawns off a fork job automatically.  
> You may want to investigate how well this scales, if you are likely to 
> be submitting large numbers of jobs, because the gatekeeper tends to be 
> quite heavily burdened.

This may well become a problem for one of our sites, but we'll cross
that bridge when we come to it. Thanks for the heads up!
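
For completeness, the pattern I take Dan to mean is a second, trivial job
aimed at the fork jobmanager which just runs globus-url-copy to push the
results back over gsiftp. A rough, untested sketch (the hosts and paths
are invented, and the submit side would need a GridFTP server listening):

    # Hypothetical follow-up "fork" job to copy results back via gsiftp
    universe            = globus
    globusscheduler     = gatekeeper.example.ac.uk/jobmanager-fork
    executable          = /usr/local/globus/bin/globus-url-copy
    transfer_executable = false
    arguments           = file:///tmp/results.dat gsiftp://submit.example.ac.uk/tmp/results.dat
    output              = stageback.out
    error               = stageback.err
    log                 = stageback.log
    queue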

Cheers,

Mark

-- 
Mark Calleja

Department of Earth Sciences, University of Cambridge
Downing Street, Cambridge CB2 3EQ, UK
Tel. (+44/0) 1223 333408, Fax  (+44/0) 1223 333450
http://www.esc.cam.ac.uk/~mcal00
