Concepts include: class extensions, refactoring, Big Oh.

The original Global Alignment project provided students with a basic understanding of the methods behind DNA Matrix Alignment. This portion of the project aims to show code improvements by extension of the class Global Alignment to use an improved algorithm, explained here. This model of Matrix alignment is closer to the BLAST algorithm itself.

Global Alignment created a double matrix of SeqA x SeqB size. In the diagram below, most of the cells are pointing towards the center.

However, with a threshold imposed on how wide the diagonal in the center can be, the matrix's size can be reduced greatly, resulting in a memory savings. While the matrix above is an O(n^2) algorithm, the matrix shown the left is a O(dk) where d is the largest sequence's length, and k is the resulting width in the matrix (2 * thresh - 1).

The thin matrix shown here is the basic d x k array derived from the improved matrix. At first, I was a little confused as to how to implement the traceBack() method, since the Cell locations in the d x k matrix no longer directly represent their respective symbols at those locations in the Sequences. After some consideration, it became clear that the information stored in the dxk matrix could translate back to the original matrix by using the threshold and the location on the vertical axis of the dxk matrix. This is shown below.

In other news, I’ve added the progress bar to the iTunes Project. First, the song is selected:

And then, is read. A similar progress bar appears for the writing portion of the program.