
Re: [Condor-users] Blast on Windows



Jess,
You may find some useful help in the hints on the Optena support
pages. Some time back I was pointed to this page:
http://condor.optena.com/display/CONDOR/How+To+Increase+Debugging+Messages
The information it gave me (on the execution computers) pointed me to
the problem and its resolution.
There are other pages for Troubleshooting at:
http://condor.optena.com/display/CONDOR/Troubleshooting
I hope that this helps.
 
Cheers,
 
Phil
__________________________________________________
Philip Crawford, B. Comp. Sc., MIEEE
School of Medical Sciences
The University of NSW
Phone: +61-2-9385 2564
Mobile: +61-419-294 698
Fax: +61-2-9385 1059
Email: p.crawford@xxxxxxxxxxx
__________________________________________________

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jess Cannata
Sent: Wednesday, 12 April 2006 1:21 PM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Blast on Windows

I am having problems running Blast (non-parallel) jobs on Windows XP 
using Condor 6.6.10. Blast jobs that take less than an hour finish 
without error. Jobs that take more than an hour never terminate even though 
the job has "finished," meaning the job no longer uses any CPU time. 
They continue to run forever and continue to allocate RAM. I've included 
a log file of one of the non-terminating jobs which should have finished 
after an hour and a half.

I have run the same jobs on the same machines without Condor and these 
jobs finish without any problem. Other types of jobs (non-Blast) run 
longer than one hour without incident. Has anyone seen this problem 
before? Would any Blast on Windows users be willing to share their 
Condor submission files?

Thanks in advance.

Jess
000 (276.000.000) 04/09 00:32:38 Job submitted from host: <141.161.x.2:9664>
...
001 (276.000.000) 04/09 00:32:44 Job executing on host: <141.161.x.38:1227>
...
006 (276.000.000) 04/09 00:32:52 Image size of job updated: 88744
...
006 (276.000.000) 04/09 00:52:52 Image size of job updated: 881472
...
006 (276.000.000) 04/09 01:12:52 Image size of job updated: 881660
...
010 (276.000.000) 04/09 12:18:48 Job was suspended.
        Number of processes actually suspended: 2
...
011 (276.000.000) 04/09 12:28:50 Job was unsuspended.
...
004 (276.000.000) 04/09 12:28:50 Job was evicted.
        (0) Job was not checkpointed.
                Usr 0 00:28:45, Sys 0 00:00:33  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        0  -  Run Bytes Sent By Job
        2190680  -  Run Bytes Received By Job
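
To make the timing in the log above easier to read, here is a minimal
Python sketch that pulls out the event header lines and prints the time
elapsed between consecutive events. It is an illustration only, not part
of Condor: the function names (parse_events, gaps) are hypothetical, the
line pattern is inferred from the excerpt above, and the year is an
assumption because the log records only month and day.

```python
import re
from datetime import datetime

# Header line of a user-log event, as seen in the excerpt above:
# "NNN (cluster.proc.subproc) MM/DD HH:MM:SS message"
EVENT_RE = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$"
)

def parse_events(log_text, year=2006):
    """Return (event_code, timestamp, message) for each event header line.

    The log carries no year, so `year` is an assumption supplied by the caller.
    Indented detail lines and "..." separators are skipped.
    """
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.match(line)
        if m:
            stamp = datetime.strptime(f"{year}/{m.group(5)}", "%Y/%m/%d %H:%M:%S")
            events.append((m.group(1), stamp, m.group(6)))
    return events

def gaps(events):
    """Yield (elapsed, previous_message, message) for consecutive events."""
    for (_, t0, msg0), (_, t1, msg1) in zip(events, events[1:]):
        yield t1 - t0, msg0, msg1

# Three header lines copied from the log above.
SAMPLE = """\
001 (276.000.000) 04/09 00:32:44 Job executing on host: <141.161.x.38:1227>
...
006 (276.000.000) 04/09 01:12:52 Image size of job updated: 881660
...
010 (276.000.000) 04/09 12:18:48 Job was suspended.
"""

for elapsed, before, after in gaps(parse_events(SAMPLE)):
    print(elapsed, "|", before, "->", after)
```

Run on those three lines, the sketch reports 0:40:08 between the start of
execution and the last image size update, and 11:05:56 between that
update and the suspension.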
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users