Total online: 1
Is Online User
Main » 2011 » August » 12 » Needleman–Wunsch Algorithm 2
Needleman–Wunsch Algorithm 2
- Traceback = the process of deduction of the best alignment from the traceback matrix.
- The traceback always begins with the last cell to be filled with the score, i.e. the bottom right cell.
- One moves according to the traceback value written in the cell.
- There are three possible moves: diagonally (toward the top-left corner of the matrix), up, or left.
- The traceback is completed when the first, top-left cell of the matrix is reached ("done" cell).
The traceback performed on the completed traceback matrix:
- The alignment is deduced from the values of cells along the traceback path, by taking into account the values of the cell in the traceback matrix:
Sequences are aligned backwards.
- diag - the letters from two sequences are aligned
- left - a gap is introduced in the left sequence
- up - a gap is introduced in the top sequence
- The Traceback Step-By-Step
1. The first cell from the traceback path is "diag" implying that the corresponding letters are aligned:
2. The second cell from the traceback path is also "diag" implying that the corresponding letters are aligned:
3. The third cell from the traceback path is "left" implying the gap in the left sequence (i.e. we stay on the letter A from the left sequence):
4. The fourth cell from the traceback path is also "diag" implying that the corresponding letters are aligned. We consider the letter A again, this time it is aligned with S:
- Compare With The Exhaustive Search
- The best alignment via the Needleman-Wunsch algorithm:
- The exhaustive search:
- -AND score: +1
- A-ND score: +3 the best
- AN-D score: -3
- AND- score: -8
- A Few Observations
- It was much easier to align SEND and AND by the exhaustive search!
- As we consider longer sequences the situation quickly turns against the exhaustive search:
- Two 12 residue sequences would require considering ~ 1 million alignments.
- Two 150 residue sequences would require considering ~ 10ˆ88 alignments (~ 10ˆ78 is the estimated number of atoms in the Universe).
- For two 150 residue sequences the Needleman-Wunsch algorithm requires filling a 150 * 150 matrix.
- The alignment is over the entire length of two sequences: the traceback starts from the lower right corner of the traceback matrix, and completes in the upper left cell of the matrix.
- The Needleman-Wunsch algorithm works in the same way regardless of the length or complexity of sequences, and guarantees to nd the best alignment.
- The Needleman-Wunsch algorithm is appropriate for finding the best alignment of two sequences which are (i) of the similar length; (ii) similar across their entire lengths.
Category: Algorithms |
Views: 979 |
Added by: Ansari
| Rating: 0.0/0||