
Re: [Condor-users] limit on input file transfers ????

Hello Ian,

I had the same problem when running BLAST searches against the whole human genome.

1/ I have access to all the compute nodes, which are kept within our intranet
  (unlike remote Globus machines).

2/ I installed a /home/gridcache directory on every Condor execute node, and every BLAST job starts with an 'rsync' command to make sure that the local cached copy is synchronized with the current BLAST db kept on NFS. When nothing has changed, this costs only a few ms and generates almost no network traffic.

3/ The verbose BLAST output is kept locally in /tmp, and only the pertinent data are extracted into a table, gzipped and sent back to the Condor shared file system (NFS).
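
Steps 2/ and 3/ could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the actual wrapper used here: the NFS path, the function names and the `keep` filter are hypothetical, and only /home/gridcache comes from the description above. The rsync options used (`-a`, `--delete`) are standard ones.

```python
import gzip
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List

CACHE_DIR = "/home/gridcache/"   # local cache directory on each execute node
NFS_DB_DIR = "/nfs/blastdb/"     # hypothetical location of the master db on NFS


def rsync_command(src: str = NFS_DB_DIR, dst: str = CACHE_DIR) -> List[str]:
    """Build the rsync call that refreshes the local cache from NFS."""
    # -a keeps timestamps, so unchanged files are skipped without being re-read.
    # --delete drops cached files that no longer exist in the master copy.
    # The trailing slash on src means "copy the contents of this directory".
    return ["rsync", "-a", "--delete", src, dst]


def sync_cache() -> None:
    """Run the sync at the start of a job; raises if rsync fails."""
    subprocess.run(rsync_command(), check=True)


def extract_table(raw_lines: Iterable[str], keep: Callable[[str], bool]) -> List[str]:
    """Keep only the lines of the verbose output that the caller cares about."""
    return [line for line in raw_lines if keep(line)]


def write_gzipped(lines: Iterable[str], dest: Path) -> None:
    """Write the reduced table as a gzip file, e.g. onto the shared file system."""
    with gzip.open(dest, "wt") as out:
        out.writelines(lines)
```

A job would call sync_cache() first, run BLAST against the cached db with its raw output going to /tmp, and then pass that output through extract_table() and write_gzipped() so that only the small compressed table travels back over NFS.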

Perhaps some of these ideas can be used in your own scheme.



Dr Ian C. Smith wrote:
--On 30 March 2005 09:13 -0600 Nick LeRoy <nleroy@xxxxxxxxxxx> wrote:

On Wed March 30 2005 7:27 am, Dr Ian C. Smith wrote:

Very quick question:

Is there a Condor-imposed limit on the total size of
input files that can be transferred to an execute host or
a limit on the total time it takes to transfer them?

>         -671628160  -  Run Bytes Received By Job

The entire database is ~ 3.5 GB but the largest file is only ~ 500 MB
and there is enough free disk space on the execution host. The negative
value for the bytes transferred suggests a 32 bit addressing limitation
to me.
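
The 32-bit hypothesis can be checked with a little arithmetic. The sketch below (function names are hypothetical) assumes the counter is a signed 32-bit integer that wrapped around exactly once. Under that assumption the reported -671628160 corresponds to 3,623,339,136 bytes (about 3.6 x 10^9 bytes, or roughly 3.4 GiB), which is in the region of the ~3.5 GB database. That is consistent with a wrap-around but does not prove it.

```python
from typing import List


def as_signed_32(n: int) -> int:
    """Interpret the low 32 bits of n as a signed two's-complement value."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n


def candidate_true_sizes(reported: int, max_wraps: int = 2) -> List[int]:
    """Non-negative byte counts that a signed 32-bit counter shows as `reported`."""
    low = reported & 0xFFFFFFFF
    return [low + k * (1 << 32) for k in range(max_wraps)]


reported = -671628160                       # value from the job log
smallest = candidate_true_sizes(reported)[0]
print(smallest)                             # 3623339136
print(as_signed_32(smallest) == reported)   # True
```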

What version of Condor are you running?  Prior to Condor 6.7.2, Condor
couldn't handle files larger than 2G, and it's possible that large
collections of transfers could have caused problems, too (I don't
remember all of the details).  These changes never appeared in the 6.6


6.6.5 on all the execution hosts and the submit host, 6.6.9 on the central
manager. Are the input files ZIP'ed (or similar) before staging, as with Globus?
I can imagine this could cause problems with a 2GB limit. Is there any way around this?

We are looking at making the databases available through remote file system
access (Novell or similar). I suspect that the overhead involved in that
may be more than for actually copying the files over though - particularly if the
file staging time can be amortized over many BLAST runs. I think it should be possible
to transfer individual divisions to separate execution hosts on a divide-and-rule
basis. More programming hassle though :-(
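
One possible way to plan such a split is sketched below in Python. It is an assumption-laden illustration, not something Condor provides: the division names, host names and the function are hypothetical, and the sizes would come from the actual database files. It hands out divisions largest first, always to the host with the fewest bytes assigned so far.

```python
import heapq
from typing import Dict, List


def assign_divisions(sizes: Dict[str, int], hosts: List[str]) -> Dict[str, List[str]]:
    """Greedy split: give each division to the currently least-loaded host."""
    heap = [(0, host) for host in hosts]    # (bytes assigned so far, host name)
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {host: [] for host in hosts}
    # Placing the largest divisions first gives a more even final balance.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, host = heapq.heappop(heap)
        plan[host].append(name)
        heapq.heappush(heap, (load + size, host))
    return plan
```

Each host would then only need its own divisions staged, so no single transfer has to carry the whole ~3.5 GB, at the cost of merging the per-division results afterwards.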



--
------------------------------------------------------------
Dr Alain EMPAIN <alain.empain@xxxxxxxxx> <alain@xxxxxxxxxx>
Bioinformatics, Molecular Genetics,
Fac. Med. Vet., University of LIEGE, Belgium
Bd de Colonster, B43
B-4000 LIEGE (Sart-Tilman)
WORK: +32 4 366 4159
FAX:  +32 4 366 4122
HOME: rue des Martyrs,7 B- 4550 Nandrin +32 85 51 2341
GSM:  +32 497 70 1764
------------------------------------------------------------
What's your favorite Linux program?
That's like asking a poet what his favorite word is;
it's all in how they go together.
(Michael Stutz, author of The Linux Cookbook)
------------------------------------------------------------