A Smith-Waterman implementation with traditional vectorization along anti-diagonals, used to locate a query within a database. Query lengths are limited by 16-bit arithmetic to (32767-g)/e, where g is the open gap penalty and e is the extend gap penalty. Database sequences may be of arbitrary length.
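The query length bound above can be checked before a run. The following is a minimal sketch, not code from sw_db.c; the function name max_query_length is hypothetical, and integer (truncating) division is assumed.

```c
/* Hypothetical helper, not part of sw_db.c.
   Returns the largest query length allowed by the bound
   (32767 - g) / e stated above, using truncating integer division.
   Returns -1 if the penalties are outside a sensible range. */
long max_query_length(long open_gap, long extend_gap)
{
    if (extend_gap <= 0)
        return -1;                      /* avoid division by zero */
    if (open_gap < 0 || open_gap > 32767)
        return -1;                      /* bound would be negative */
    return (32767 - open_gap) / extend_gap;
}
```

With the default penalties (og = 8, eg = 1) this gives 32759; with -o 3 -e 2 it gives 16382.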
usage: sw_db [options] <scoring matrix file> <query fasta file> [db fasta file]
-a output alignments (default = top scores only)
-e extend gap penalty (eg), default = 1
a gap of length 1 is scored as og + eg
-n number of top scores to report (default = 10)
-o open gap penalty (og), default = 8
-h help
<query fasta file> a single query
[db fasta file] may be specified or read from stdin
example: sw_db -a -e 2 -o 3 pam_1 query nt
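The option list states only that a gap of length 1 costs og + eg. The sketch below assumes the usual affine extension of that rule, where a gap of k residues costs og + k*eg. This is an assumption, not something confirmed by the text above, although it is consistent with the query length bound (32767-g)/e. The function name gap_penalty is hypothetical.

```c
/* Hypothetical helper, not part of sw_db.c.
   Assumed affine model: a gap of k residues costs og + k * eg,
   which reduces to og + eg for k = 1 as stated in the option list.
   A non-positive k means no gap and costs nothing. */
int gap_penalty(int og, int eg, int k)
{
    if (k <= 0)
        return 0;
    return og + k * eg;
}
```

Under this assumption, the defaults (og = 8, eg = 1) give a penalty of 9 for a one-residue gap, and the example command line (-e 2 -o 3) gives 11 for a four-residue gap.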
example scoring matrix file (comments have a # in the first column,
last column of letters is optional):
# PAM 1
#A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
2 0 -6 0 0 0 -6 0 0 0 0 0 0 0 0 0 0 0 0 -6 0 0 0 0 0 0 A
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 B
-6 0 2 0 0 0 -6 0 0 0 0 0 0 0 0 0 0 0 0 -6 0 0 0 0 0 0 C
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 E
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 F
-6 0 -6 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 -6 0 0 0 0 0 0 G
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 H
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 K
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 L
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 M
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 O
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 P
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 S
-6 0 -6 0 0 0 -6 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 T
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 U
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 W
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 X
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Y
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Z
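For readers writing their own tools around this file format, one row of the matrix can be parsed as sketched below. This is not the parser used by sw_db; the function name parse_matrix_row and its return codes are hypothetical. It assumes 26 whitespace-separated integers per row, treats a # in the first column as a comment, and ignores the optional trailing letter.

```c
#include <stdlib.h>
#include <string.h>

#define ALPHABET 26

/* Hypothetical helper, not part of sw_db.c.
   Parses one line of a scoring matrix file into row[0..25].
   Returns  1 if a full row of 26 integers was read,
            0 if the line is a comment (# in first column) or blank,
           -1 if the line holds fewer than 26 integers. */
int parse_matrix_row(const char *line, int row[ALPHABET])
{
    const char *p = line;

    if (line[0] == '#')
        return 0;                       /* comment line */
    if (line[strspn(line, " \t\r\n")] == '\0')
        return 0;                       /* blank line */

    for (int col = 0; col < ALPHABET; col++) {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading whitespace */
        if (end == p)
            return -1;                  /* no number where one was expected */
        row[col] = (int)v;
        p = end;
    }
    /* Anything left on the line, such as the optional row letter, is ignored. */
    return 1;
}
```

Calling this once per line and storing each successful row in turn yields a 26 x 26 table. Indexing that table by letter minus 'A' is a natural choice given the A to Z column header, but that layout is inferred from the example rather than stated.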
Requires the Portable Cray Bioinformatics Library, available at http://cbl.sf.net, for alignment of top scorers. After installing it, edit the makefile to set the installation directory, and edit the "#define NUM_SPE 6" statement in sw_db.c to match the number of SPEs on your system. Then run:
$ make
$ sudo make install
James Long
jlong@alaska.edu
International Arctic Research Center
Shawn Houston
houston@alaska.edu
Biotechnology Computing Research Group