
Re: [Condor-users] More BLAST



Hi Dallas, 
Did you get this sorted?
(been on holiday and just catching up on email)

What I've done before, on a Windows pool where BLAST isn't installed on all the nodes, is transfer the executable and the database files, put the BLAST parameters in a batch file, and run that batch file as the job.

Here's my submit file:

=========================
Executable = blast.bat
Universe = vanilla

Arguments =  :INPUT_FILE: :INPUT_FILE:.out

Transfer_input_files = :FULL_PATH_INPUT_FILE:, C:\blast\condor\test\megablast.exe, C:\blast\condor\test\blast.bat, C:\blast\condor\test\umd3_Chr1.nsq, C:\blast\condor\test\umd3_Chr1.nin, C:\blast\condor\test\umd3_Chr1.nhr

Output = :OUTPUT_FILE:
Error = :ERROR_FILE:
Log = :LOG_FILE:

Should_transfer_files = IF_NEEDED
When_to_transfer_output = ON_EXIT
Transfer_executable = False

Notification = Error
Coresize = 0

Queue
=========================
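
The :INPUT_FILE:-style tokens in the submit file are placeholders to be filled in for each job before it is submitted. A minimal sketch of one way to do that substitution in Python follows. The helper name `fill_template`, the shortened template, and the example file names are all hypothetical, for illustration only, not necessarily how the file above was generated.

```python
import re

# Shortened, illustrative copy of the submit file above (hypothetical subset).
TEMPLATE = r"""Executable = blast.bat
Arguments = :INPUT_FILE: :INPUT_FILE:.out
Transfer_input_files = :FULL_PATH_INPUT_FILE:, C:\blast\condor\test\megablast.exe
Output = :OUTPUT_FILE:
Queue
"""

def fill_template(template: str, values: dict) -> str:
    """Replace every :NAME: placeholder with values[NAME].

    Raises KeyError if a placeholder has no value, so a half-filled
    submit file is never produced silently.
    """
    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder :{name}:")
        # A function replacement is inserted literally, so Windows
        # backslashes in the value are not treated as escapes.
        return values[name]

    # Upper-case letters and underscores between two colons; a drive
    # prefix such as C:\ does not match because no second colon follows.
    return re.sub(r":([A-Z_]+):", lookup, template)

if __name__ == "__main__":
    print(fill_template(TEMPLATE, {
        "INPUT_FILE": "q1.fa",
        "FULL_PATH_INPUT_FILE": r"C:\data\q1.fa",
        "OUTPUT_FILE": "q1.stdout",
    }))
```

The filled-in text can then be written to a file and passed to condor_submit.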

This is my blast.bat:

megablast.exe -d umd3_Chr1 -i %1 -o %2 -t 21 -W 11 -q -3 -r 2 -G 5 -E 2 -s 56 -N 2 -D 2 -m 8 -F "m D" -U T



If you make sure all the required files are transferred and you specify Output, Error and Log files in the submit file, you should be able to track down any errors.
Let me know if there's anything I haven't made clear.
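
As a rough illustration of checking the job log for failures, here is a minimal Python sketch that pulls the return value out of each "Job terminated" event. The event code 005 and the "Normal termination (return value N)" wording are taken from the BLAST.log quoted further down in this message. That other Condor versions use the same layout is an assumption, and the function name `job_return_values` is hypothetical.

```python
import re

# Excerpt in the same layout as the BLAST.log quoted later in this message.
SAMPLE_LOG = """\
001 (025.000.000) 12/23 13:46:27 Job executing on host: <10.115.2.13:1056>
...
005 (025.000.000) 12/23 13:46:28 Job terminated.
    (1) Normal termination (return value 1)
...
"""

def job_return_values(log_text: str) -> dict:
    """Map each job id (cluster.proc.subproc) to its return value.

    Only 'Normal termination' lines that follow a 005 event header
    are counted.
    """
    results = {}
    current = None  # job id of the 005 event being read, if any
    for line in log_text.splitlines():
        event = re.match(r"^(\d{3}) \((\d+\.\d+\.\d+)\)", line)
        if event:
            current = event.group(2) if event.group(1) == "005" else None
            continue
        ret = re.search(r"Normal termination \(return value (-?\d+)\)", line)
        if ret and current is not None:
            results[current] = int(ret.group(1))
    return results

if __name__ == "__main__":
    for job, code in job_return_values(SAMPLE_LOG).items():
        status = "ok" if code == 0 else "FAILED"
        print(job, code, status)
```

A non-zero return value, as in the sample, means the program itself reported a failure, so the file named by Error is the next place to look.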


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies@xxxxxxxxxxxxxxxx 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz 





> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-
> bounces@xxxxxxxxxxx] On Behalf Of Thomas, Dallas
> Sent: Thursday, 24 December 2009 8:30 a.m.
> To: Condor-Users Mail List
> Subject: [Condor-users] More BLAST
> 
> I have an NTFS share mapped to K:\ on the execute machines and to
> /mnt/condor on the central manager and submit machines.  The Windows BLAST
> executables are located in the /blast/bin directory on the share.  Using
> the following submit script, I ran condor_submit.
> 
> # Submit script
> Universe = Vanilla
> Requirements = OpSys == "WINNT51" && Arch == "INTEL"
> Executable = /mnt/condor/blast/bin/blastall.exe
> Arguments = -p blastn -d K:\blast\db\nt -i K:\blast\test\sample.fsa -o blastout.ncbi
> Should_Transfer_Files = Yes
> When_To_Transfer_Output = ON_EXIT
> Log = BLAST.log
> Queue
> 
> 
> According to condor_q my job ran, but not in the way I had hoped.
> It appears that BLAST did not actually run: the entire job
> took mere seconds and produced no output.  I checked the log
> files and this is what they show...
> 
> --
> BLAST.log
> --
> ...
> 000 (025.000.000) 12/23 13:46:24 Job submitted from host:
> <10.115.0.59:40535>
> ...
> 001 (025.000.000) 12/23 13:46:27 Job executing on host: <10.115.2.13:1056>
> ...
> 005 (025.000.000) 12/23 13:46:28 Job terminated.
>     (1) Normal termination (return value 1)
>         Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
>         Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
>         Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
>         Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
>     0  -  Run Bytes Sent By Job
>     2621440  -  Run Bytes Received By Job
>     0  -  Total Bytes Sent By Job
>     2621440  -  Total Bytes Received By Job
> ...
> 
> --
> ShadowLog
> --
> 12/23 13:46:24 ******************************************************
> 12/23 13:46:24 ** condor_shadow (CONDOR_SHADOW) STARTING UP
> 12/23 13:46:24 ** /data/condor/condor-7.4.0/sbin/condor_shadow
> 12/23 13:46:24 ** SubsystemInfo: name=SHADOW type=SHADOW(6)
> class=DAEMON(1)
> 12/23 13:46:24 ** Configuration: subsystem:SHADOW local:<NONE>
> class:DAEMON
> 12/23 13:46:24 ** $CondorVersion: 7.4.0 Oct 31 2009 BuildID: 193173 $
> 12/23 13:46:24 ** $CondorPlatform: X86_64-LINUX_RHEL5 $
> 12/23 13:46:24 ** PID = 17457
> 12/23 13:46:24 ** Log last touched 12/23 13:29:40
> 12/23 13:46:24 ******************************************************
> 12/23 13:46:24 Using config source: /mnt/condor/etc/condor_config
> 12/23 13:46:24 Using local config sources:
> 12/23 13:46:24    /mnt/condor/local/local.ablethr21/condor_config.local
> 12/23 13:46:24 DaemonCore: Command Socket at <10.115.0.59:42175>
> 12/23 13:46:24 Initializing a VANILLA shadow for job 25.0
> 12/23 13:46:24 (25.0) (17457): Request to run on ablethr518096.AGR.GC.CA
> <10.115.2.13:1056> was ACCEPTED
> 12/23 13:46:28 (25.0) (17457): Job 25.0 terminated: exited with status 1
> 12/23 13:46:28 (25.0) (17457): **** condor_shadow (condor_SHADOW) pid
> 17457 EXITING WITH STATUS 100
> 
> --
> SchedLog
> --
> 12/23 13:46:24 (pid:22969) Sent ad to central manager for condor@xxxxxxxxx
> 12/23 13:46:24 (pid:22969) Sent ad to 1 collectors for condor@xxxxxxxxx
> 12/23 13:46:24 (pid:22969) Activity on stashed negotiator socket
> 12/23 13:46:24 (pid:22969) Negotiating for owner: condor@xxxxxxxxx
> 12/23 13:46:24 (pid:22969) Checking consistency running and runnable jobs
> 12/23 13:46:24 (pid:22969) Tables are consistent
> 12/23 13:46:24 (pid:22969) Rebuilt prioritized runnable job list in
> 0.007s.
> 12/23 13:46:24 (pid:22969) Out of jobs - 1 jobs matched, 0 jobs idle,
> flock level = 0
> 12/23 13:46:24 (pid:22969) Completed REQUEST_CLAIM to startd
> ablethr518096.AGR.GC.CA <10.115.2.13:1056> for condor@xxxxxxxxx
> 12/23 13:46:24 (pid:22969) Starting add_shadow_birthdate(25.0)
> 12/23 13:46:24 (pid:22969) Started shadow for job 25.0 on
> ablethr518096.AGR.GC.CA <10.115.2.13:1056> for condor@xxxxxxxxx, (shadow
> pid = 17457)
> 12/23 13:46:28 (pid:22969) Shadow pid 17457 for job 25.0 exited with
> status 100
> 12/23 13:46:28 (pid:22969) Checking consistency running and runnable jobs
> 12/23 13:46:28 (pid:22969) Tables are consistent
> 12/23 13:46:28 (pid:22969) Rebuilt prioritized runnable job list in
> 0.008s.  (Expedited rebuild because no match was found)
> 12/23 13:46:28 (pid:22969) match (ablethr518096.AGR.GC.CA
> <10.115.2.13:1056> for condor@xxxxxxxxx) out of jobs; relinquishing
> 12/23 13:46:28 (pid:22969) Completed RELEASE_CLAIM to startd at
> <10.115.2.13:1056>
> 12/23 13:46:28 (pid:22969) Match record (ablethr518096.AGR.GC.CA
> <10.115.2.13:1056> for condor@xxxxxxxxx, 25.-1) deleted
> 12/23 13:46:28 (pid:22969) Attempting to chown
> '/mnt/condor/local/local.ablethr21/spool/cluster25.proc0.subproc0' from
> 502 to 502.502, but the path was unexpectedly owned by 0
> 12/23 13:46:29 (pid:22969) Sent owner (0 jobs) ad to 1 collectors
> 
> --
> StarterLog
> --
> 12/23 11:46:25 Locale: English_United States.1252
> 12/23 11:46:25 ******************************************************
> 12/23 11:46:25 ** condor_starter (CONDOR_STARTER) STARTING UP
> 12/23 11:46:25 ** C:\condor\bin\condor_starter.exe
> 12/23 11:46:25 ** SubsystemInfo: name=STARTER type=STARTER(8)
> class=DAEMON(1)
> 12/23 11:46:25 ** Configuration: subsystem:STARTER local:<NONE>
> class:DAEMON
> 12/23 11:46:25 ** $CondorVersion: 7.4.0 Oct 31 2009 BuildID: 193173 $
> 12/23 11:46:25 ** $CondorPlatform: INTEL-WINNT50 $
> 12/23 11:46:25 ** PID = 3564
> 12/23 11:46:25 ** Log last touched 12/23 11:29:39
> 12/23 11:46:25 ******************************************************
> 12/23 11:46:25 Using config source: C:\condor\condor_config
> 12/23 11:46:25 Using local config sources:
> 12/23 11:46:25    C:\condor/condor_config.local
> 12/23 11:46:25 DaemonCore: Command Socket at <10.115.2.13:3768>
> 12/23 11:46:25 GLEXEC_JOB not supported on this platform; ignoring
> 12/23 11:46:25 Setting resource limits not implemented!
> 12/23 11:46:25 Communicating with shadow <10.115.0.59:42175>
> 12/23 11:46:25 Submitting machine is "ablethr21.agr.gc.ca"
> 12/23 11:46:25 setting the orig job name in starter
> 12/23 11:46:25 setting the orig job iwd in starter
> 12/23 11:46:26 File transfer completed successfully.
> 12/23 11:46:27 Job 25.0 set to execute immediately
> 12/23 11:46:27 Starting a VANILLA universe job with ID: 25.0
> 12/23 11:46:27 Tracking process family by login "condor-reuse-slot1"
> 12/23 11:46:27 IWD: C:\condor\execute\dir_3564
> 12/23 11:46:27 Output file: C:\condor\execute\dir_3564\blastout.ncbi
> 12/23 11:46:27 Renice expr "10" evaluated to 10
> 12/23 11:46:27 About to exec C:\condor\execute\dir_3564\condor_exec.exe -p
> blastn -d K:\blast\db\nt -i K:\blast\test\sample.fsa -o blastout.ncbi
> 12/23 11:46:27 Create_Process succeeded, pid=1132
> 12/23 11:46:27 Process exited, pid=1132, status=1
> 12/23 11:46:27 Got SIGQUIT.  Performing fast shutdown.
> 12/23 11:46:27 ShutdownFast all jobs.
> 12/23 11:46:28 **** condor_starter (condor_STARTER) pid 3564 EXITING WITH
> STATUS 0
> 
> --------------------------------------------------------------------------
> 
> Does anyone have any idea what might be wrong here?  Any help would be
> greatly appreciated.
> 
> D