Traceback = the process of deduction of the best alignment from the traceback matrix.

The traceback always begins with the last cell to be filled with the score, i.e. the bottom right cell.

One moves according to the traceback value written in the cell.

There are three possible moves: diagonally (toward the top-left corner of the matrix), up, or left.

The traceback is completed when the first, top-left cell of the matrix is reached ("done" cell).

The traceback performed on the completed traceback matrix:

The Best Alignment

The alignment is deduced from the values of cells along the traceback path, by taking into account the values of the cell in the traceback matrix:

diag - the letters from two sequences are aligned

left - a gap is introduced in the left sequence

up - a gap is introduced in the top sequence

Sequences are aligned backwards.

The Traceback Step-By-Step

1. The first cell from the traceback path is "diag" implying that the corresponding letters are aligned:

D

D

2. The second cell from the traceback path is also "diag" implying that the corresponding letters are aligned:

ND

ND

3. The third cell from the traceback path is "left" implying the gap in the left sequence (i.e. we stay on the letter A from the left sequence):

END

-ND

4. The fourth cell from the traceback path is also "diag" implying that the corresponding letters are aligned. We consider the letter A again, this time it is aligned with S:

SEND

A-ND

Compare With The Exhaustive Search

The best alignment via the Needleman-Wunsch algorithm:

SEND

A-ND

The exhaustive search:

SEND

-AND score: +1

A-ND score: +3 the best

AN-D score: -3

AND- score: -8

A Few Observations

It was much easier to align SEND and AND by the exhaustive search!

As we consider longer sequences the situation quickly turns against the exhaustive search:

Two 12 residue sequences would require considering ~ 1 million alignments.

Two 150 residue sequences would require considering ~ 10ˆ88 alignments (~ 10ˆ78 is the estimated number of atoms in the Universe).

For two 150 residue sequences the Needleman-Wunsch algorithm requires filling a 150 * 150 matrix.

The Summary

The alignment is over the entire length of two sequences: the traceback starts from the lower right corner of the traceback matrix, and completes in the upper left cell of the matrix.

The Needleman-Wunsch algorithm works in the same way regardless of the length or complexity of sequences, and guarantees to nd the best alignment.

The Needleman-Wunsch algorithm is appropriate for finding the best alignment of two sequences which are (i) of the similar length; (ii) similar across their entire lengths.