
Re: [Condor-users] Using Dropbox as a Condor pool job submission mechanism



Hi Ian,

I have spent quite a bit of time thinking about something very similar to this.  You are using the file system as the message queue.  It is quite a reasonable approach, in my opinion.

I'm not sure it is necessary to do anything that is Dropbox-specific; Dropbox may just add extra features (e.g. auto-syncing shared files back to user machines).

You could easily imagine clients that are simply configured with account access details and a path, and all the rest is based on configuration files and job files found in particular directories.
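
To make the "file system as message queue" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: each job lives in its own sub-directory of the configured path, the submit file has the hypothetical standard name job.submit, and a hypothetical marker file .submitted records that a job has already been picked up.

```python
from pathlib import Path

SUBMIT_NAME = "job.submit"   # assumed standardised submit file name (hypothetical)
MARKER_NAME = ".submitted"   # hypothetical marker left behind once a job is queued


def find_pending_jobs(root, submit_name=SUBMIT_NAME, marker_name=MARKER_NAME):
    """Return job directories under root that hold a submit file but no marker."""
    pending = []
    for job_dir in sorted(Path(root).iterdir()):
        if not job_dir.is_dir():
            continue  # loose files at the top level are ignored
        has_submit = (job_dir / submit_name).is_file()
        already_done = (job_dir / marker_name).exists()
        if has_submit and not already_done:
            pending.append(job_dir)
    return pending


def mark_submitted(job_dir, marker_name=MARKER_NAME):
    """Record that a job directory has been handed over, so it is not queued twice."""
    (Path(job_dir) / marker_name).touch()
```

A monitor would call find_pending_jobs() on a timer, hand each result to the batch system, and then call mark_submitted(). Nothing here depends on Dropbox; any mechanism that makes the directory appear on the submit node would do.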

If you have someone at Manchester who is interested in putting together a proof of concept around this, I'd be very happy to contribute.

Do you know Andrew McNab in the Physics department?  He is, in my opinion, quite a visionary thinker when it comes to HPC/grid computing, and has been "in the game" for a long time.  If you can get a bit of his time, he may be able to give you some good ideas/comments.

Ian

Ian Cottam wrote:
Has anyone thought about, or tried, using Dropbox <www.getdropbox.com> 
as a job submission mechanism to a Condor pool?

What I have in mind is the sort of pool that comprises some large number 
of nodes on a private network, with one node -- the submit node -- 
having two network cards: the second being on the global network (i.e. 
the visible Internet). Every user has an account on the submit node, and 
they have to ssh in (or similar) and copy files over; submit the job; 
monitor for it finishing; copy results out.

(If the submit node is also on a shared network file system, also 
mounted on users' machines, the below is invalidated, but I think there 
are many instances where that is not the case.)

The scenario is that all the users use 2GB free Dropbox accounts. (It 
would still work if, like me, they already had [paid] accounts.) The 
single job submit node 'owns' a paid account of 50 or 100GB.
When one becomes a user of the pool, a folder is shared, via Dropbox, 
from the job submit node account with the user's account. Users do not 
need login accounts on the submit node.

Then, to submit a Condor job into the private pool, a user would simply 
move or copy the set of files required into the shared folder. A simple 
monitor program on the job submit node would need to look for jobs 
arriving and condor_submit them. The results simply appear back on the 
user's remote machine courtesy of the Dropbox sync mechanism.

Clearly, there would be some subtleties of implementation; for example: 
a Condor job submit file would have to have a standardised name; and one 
would need a mechanism for ensuring Dropbox had synchronised all the 
files before the job was run; and no doubt others.
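
One possible way to handle those two subtleties, sketched in Python under stated assumptions: the submit file is assumed to be called job.submit (a hypothetical name), and "synchronised" is approximated by the job directory looking identical (same files, sizes and modification times) on two consecutive polls. That is only a heuristic and not a guarantee that the sync has finished; a manifest file listing the expected files would be a stronger check. The only Condor call used is condor_submit with the submit file as its argument.

```python
import subprocess
import time
from pathlib import Path


def snapshot(job_dir):
    """Map each file's relative path to (size, mtime) so two polls can be compared."""
    job_dir = Path(job_dir)
    return {
        str(p.relative_to(job_dir)): (p.stat().st_size, p.stat().st_mtime_ns)
        for p in job_dir.rglob("*")
        if p.is_file()
    }


def submit_when_stable(job_dir, submit_name="job.submit", settle_seconds=30.0,
                       max_checks=20, submit_cmd=("condor_submit",),
                       sleep=time.sleep):
    """Submit once the directory is unchanged between two polls.

    Returns True if the job was submitted, False if it never settled
    (or the submit file never arrived) within max_checks polls.
    """
    job_dir = Path(job_dir)
    previous = snapshot(job_dir)
    for _ in range(max_checks):
        sleep(settle_seconds)
        current = snapshot(job_dir)
        if submit_name in current and current == previous:
            # Run from inside the job directory so relative paths in the
            # submit file resolve against the synced files.
            subprocess.run([*submit_cmd, submit_name], cwd=job_dir, check=True)
            return True
        previous = current
    return False
```

The submit_cmd and sleep parameters exist only so the function can be exercised without a Condor installation; in real use the defaults would be left alone.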

Just a thought, but I would be interested in reading people's comments.
regards
-Ian







  

-- 
Ian Stokes-Rees                            W: http://sbgrid.org
ijstokes@xxxxxxxxxxxxxxxxxxx               T: +1 617 432-5608 x75
SBGrid, Harvard Medical School             F: +1 617 432-5600