Re: [Condor-users] Blast on Windows
- Date: Wed, 12 Apr 2006 13:26:25 -0400
- From: Jess Cannata <jac67@xxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Blast on Windows
The individual searches take over an hour. We are still in the testing
phase, but we'd like the individual jobs to take 6-8 hours, though
the amount of time is flexible. We could run 30-minute jobs, but that
wouldn't help us discover why Condor doesn't think the Blast job has
ended. Right now we run the searches on a 100-processor Beowulf cluster.
The total process takes around ten days, and the database grows
every two weeks. So instead of buying more cluster nodes, we hope to
employ some of our idle Windows machines.
Dr Ian C. Smith wrote:
Does each individual search take over an hour? That seems a huge
amount of time even for something like the whole of EMBL.
I have had big problems with fragmentation on XP for files over
2 GB and blast basically ran forever. I'd try splitting the
blast databases into smaller volumes (100 MB max seemed a
decent compromise). The blast indexing program from NCBI
will do this for you.
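
If the indexing program's own volume splitting is not convenient, one alternative is to cut the FASTA source into pieces of at most about 100 MB before indexing each piece. The following is only a rough sketch of that idea, not the NCBI tool's method: the function names, the output naming scheme and the 100 MB default are all assumptions made for illustration.

```python
def fasta_records(handle):
    """Yield one FASTA record (header plus sequence lines) at a time."""
    record = []
    for line in handle:
        # A new header line closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        record.append(line)
    if record:
        yield "".join(record)


def chunk_records(records, max_bytes):
    """Group records into lists whose combined size stays within max_bytes.

    A single record larger than max_bytes still gets a chunk of its own.
    """
    chunk, size = [], 0
    for rec in records:
        rec_size = len(rec.encode("utf-8"))
        if chunk and size + rec_size > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(rec)
        size += rec_size
    if chunk:
        yield chunk


def split_fasta(path, prefix, max_bytes=100 * 1024 * 1024):
    """Write the pieces to prefix.000.fasta, prefix.001.fasta, ... (hypothetical naming)."""
    written = []
    with open(path) as handle:
        chunks = chunk_records(fasta_records(handle), max_bytes)
        for i, chunk in enumerate(chunks):
            out_path = f"{prefix}.{i:03d}.fasta"
            with open(out_path, "w") as out:
                out.writelines(chunk)
            written.append(out_path)
    return written
```

Records are never split across files, so each piece can be indexed and searched on its own. Note that E-values depend on database size, so hits from separate pieces may need adjusting before they are compared.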
Dr Ian C. Smith,
University of Liverpool
Computing Services Department,
Room 4.09, Chadwick Tower
Tel: ++44 (0)151 794 3745
--On 11 April 2006 23:20 -0400 Jess Cannata <jac67@xxxxxxxxxxxxxx> wrote:
I am having problems running Blast (non-parallel) jobs on Windows XP
using Condor 6.6.10. Blast jobs that take less than an hour finish
without error. Jobs that take more than an hour never terminate even though
the job has "finished," meaning the job no longer uses any CPU time.
They continue to run forever and continue to allocate RAM. I've included
a log file of one of the non-terminating jobs which should have finished
after an hour and a half.
I have run the same jobs on the same machines without Condor and these
jobs finish without any problem. Other types of jobs (non-Blast) run
longer than one hour without incident. Has anyone seen this problem
before? Would any Blast on Windows users be willing to share their
Condor submission files?
Thanks in advance.
000 (276.000.000) 04/09 00:32:38 Job submitted from host:
001 (276.000.000) 04/09 00:32:44 Job executing on host:
006 (276.000.000) 04/09 00:32:52 Image size of job updated: 88744
006 (276.000.000) 04/09 00:52:52 Image size of job updated: 881472
006 (276.000.000) 04/09 01:12:52 Image size of job updated: 881660
010 (276.000.000) 04/09 12:18:48 Job was suspended.
Number of processes actually suspended: 2
011 (276.000.000) 04/09 12:28:50 Job was unsuspended.
004 (276.000.000) 04/09 12:28:50 Job was evicted.
(0) Job was not checkpointed.
Usr 0 00:28:45, Sys 0 00:00:33 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
0 - Run Bytes Sent By Job
2190680 - Run Bytes Received By Job
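
For anyone comparing several such logs, the event lines above can be read with a short script. This is a minimal sketch that relies only on the layout visible in this excerpt (three-digit event code, job id in parentheses, month/day, time, text). The function names are hypothetical, and the year is passed in as an assumption because the log lines carry none.

```python
import re
from datetime import datetime

# Matches lines such as: 001 (276.000.000) 04/09 00:32:44 Job executing on host:
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) (.*)$"
)


def parse_events(lines, year=2006):
    """Return (code, timestamp, text) tuples; detail lines are skipped."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # continuation line such as "(0) Job was not checkpointed."
        code, _cluster, _proc, _sub, date, clock, text = match.groups()
        when = datetime.strptime(f"{year}/{date} {clock}", "%Y/%m/%d %H:%M:%S")
        events.append((code, when, text))
    return events


def span_between(events, start_code, end_code):
    """Time from the first start_code event to the last end_code event, or None."""
    starts = [when for code, when, _ in events if code == start_code]
    ends = [when for code, when, _ in events if code == end_code]
    if not starts or not ends:
        return None
    return ends[-1] - starts[0]
```

Applied to this log, span_between(events, "001", "004") gives the wall-clock time from "Job executing" to "Job was evicted", and span_between(events, "006", "004") gives the time from the first image size update to the eviction.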