Smith-Waterman on the Cell Broadband Engine


A Smith-Waterman implementation, vectorized along anti-diagonals in the traditional way, that locates a query within a database. Query lengths are limited by 16-bit arithmetic to (32767-g)/e, where g is the open gap penalty and e is the extend gap penalty. Database sequences may be of arbitrary length.
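The query-length bound can be worked out for a given pair of penalties. The following is a minimal sketch, not code from sw_db: the function name max_query_length is hypothetical, and rounding down with integer division is an assumption, since the formula above does not say how a fractional result is treated.

```c
#include <stdint.h>

/* Hypothetical helper: longest query allowed by the bound (32767 - g) / e.
 * Integer division (rounding down) is an assumption of this sketch.
 * Returns -1 for penalties the formula cannot handle. */
long max_query_length(int open_gap, int extend_gap)
{
    if (extend_gap <= 0 || open_gap < 0 || open_gap > INT16_MAX)
        return -1;
    return (INT16_MAX - open_gap) / extend_gap;
}
```

With the default penalties (g = 8, e = 1) this gives 32759; with g = 3 and e = 2 it gives 16382.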

Download ver 1.1 (2008-11-25)

Contents

1 Usage

2 Installation


1 Usage

usage: sw_db [options] <scoring matrix file> <query fasta file> [db fasta file]
              -a output alignments (default = top scores only)
              -e extend gap penalty (eg), default = 1
                 1 gap is scored as og + eg
              -n number of top scores to report (default = 10)
              -o open gap penalty (og), default = 8
              -h help 
              <query fasta file> a single query
              [db fasta file] may be specified or read from stdin
	      
example: sw_db -a -e 2 -o 3 pam_1 query nt
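To illustrate how the -o and -e penalties enter the score, here is a scalar reference sketch of Smith-Waterman with affine gaps. It is not the vectorized SPE code in sw_db.c. It assumes a gap of length k costs og + k*eg (consistent with "1 gap is scored as og + eg" above, but the extension to longer gaps is an assumption), and it uses a hypothetical stand-in sub_score (match 2, mismatch -6) instead of a scoring matrix file.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scoring matrix: match 2, mismatch -6. */
static int sub_score(char a, char b) { return a == b ? 2 : -6; }

static int max2(int a, int b) { return a > b ? a : b; }

/* Best local alignment score of query against db.
 * Assumes a gap of length k costs og + k*eg.
 * Returns -1 if memory cannot be allocated. */
int sw_score(const char *query, const char *db, int og, int eg)
{
    size_t m = strlen(query), n = strlen(db);
    /* H[i]: best score ending at query position i in the previous db column.
     * E[i]: best score ending in a gap along the db direction.
     * Starting both at 0 is harmless here: such gap states can only go
     * negative, and every cell score is clamped at 0. */
    int *H = calloc(m + 1, sizeof *H);
    int *E = calloc(m + 1, sizeof *E);
    int best = 0;

    if (!H || !E) { free(H); free(E); return -1; }

    for (size_t j = 1; j <= n; j++) {
        int diag = 0;  /* score at (i-1, j-1) */
        int up = 0;    /* score at (i-1, j)   */
        int f = 0;     /* gap along the query direction */
        for (size_t i = 1; i <= m; i++) {
            int e = max2(E[i] - eg, H[i] - og - eg);  /* extend or open */
            int h;
            f = max2(f - eg, up - og - eg);           /* extend or open */
            h = max2(0, diag + sub_score(query[i - 1], db[j - 1]));
            h = max2(h, max2(e, f));
            diag = H[i];   /* becomes (i-1, j-1) for the next row */
            H[i] = h;
            E[i] = e;
            up = h;
            if (h > best) best = h;
        }
    }
    free(H);
    free(E);
    return best;
}
```

Under these assumptions, sw_score("AAAACCCC", "AAAAGCCCC", 3, 2) returns 11: eight matches (16) minus one gap of length 1 (3 + 2).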
              
example scoring matrix file (comments have a # in the first column, 
last column of letters is optional):
 
# PAM 1

#A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 2  0 -6  0  0  0 -6  0  0  0  0  0  0  0  0  0  0  0  0 -6  0  0  0  0  0  0  A
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  B
-6  0  2  0  0  0 -6  0  0  0  0  0  0  0  0  0  0  0  0 -6  0  0  0  0  0  0  C
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  D
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  E
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  F
-6  0 -6  0  0  0  2  0  0  0  0  0  0  0  0  0  0  0  0 -6  0  0  0  0  0  0  G
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  H
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  I
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  J
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  K
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  L
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  M
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  N
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  O
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  P
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  Q
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  R
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  S
-6  0 -6  0  0  0 -6  0  0  0  0  0  0  0  0  0  0  0  0  2  0  0  0  0  0  0  T
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  U
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  V
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  W
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  X
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  Y
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  Z
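A file in this layout could be read along the following lines. This is a sketch under stated assumptions, not the parser used by sw_db: the names parse_matrix_row and load_matrix are hypothetical, and the sketch assumes 26 whitespace-separated integers per row, with comment lines marked by # in the first column and blank lines skipped.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define ALPHABET 26

/* Parse one line of a scoring matrix file (hypothetical helper).
 * Returns 1 if a row of ALPHABET scores was read, 0 for a comment
 * or blank line, and -1 if the line holds too few numbers. */
int parse_matrix_row(const char *line, int row[ALPHABET])
{
    const char *p = line;

    if (*p == '#')                      /* comment: # in the first column */
        return 0;
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0')                     /* blank line */
        return 0;

    for (int k = 0; k < ALPHABET; k++) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p)                   /* no number where one was expected */
            return -1;
        row[k] = (int)v;
        p = end;
    }
    return 1;                           /* optional trailing letter is ignored */
}

/* Fill a 26 x 26 matrix from fp. Returns 0 on success, -1 on error. */
int load_matrix(FILE *fp, int m[ALPHABET][ALPHABET])
{
    char line[512];
    int r = 0;

    while (r < ALPHABET && fgets(line, sizeof line, fp)) {
        int status = parse_matrix_row(line, m[r]);
        if (status < 0)
            return -1;
        r += status;                    /* advance only on a data row */
    }
    return r == ALPHABET ? 0 : -1;
}
```

Rows are taken in file order, so row 0 is A and row 25 is Z; the trailing letter column is never consulted, which matches its being optional.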

2 Installation

Requires the Portable Cray Bioinformatics Library, available at http://cbl.sf.net,
for alignment of top scorers.

Edit the makefile to set the installation directory, then edit the
#define NUM_SPE 6
statement in sw_db.c to match the number of SPEs on your system. Then run

$ make
$ sudo make install



James Long
International Arctic Research Center
University of Alaska Fairbanks
PO Box 757340
Fairbanks, AK 99775
USA
Voice: (907) 474-2689
Fax: (907) 474-2643
jlong@alaska.edu

Shawn Houston
Biotechnology Computing Research Group
University of Alaska Fairbanks
PO Box 757000
Fairbanks, AK 99775
USA
Voice: (907) 474-5768
Fax: (907) 474-5712
houston@alaska.edu