Friday, 2020-02-28
BioInfo Pakistan
Site menu
Section categories
Related Subjects [38]
This category includes brief overview of all related subjects.
Defining BioInformatics [7]
In this section we tried to briefly explain what bioinformatics is ?
Unviersities [30]
This contains information about universities that are offering bioinformatics degree programs.
Resources [24]
Contains information about bioinformatics resources including databases, tools and techniques.
Algorithms [31]
This category includes some of the basic algorithms that are usually used by bioinformaticians.
Our poll
Pakistani Student
Total of answers: 0
Chat Box

Total online: 1
Guests: 1
Users: 0
Home » 2011 » August » 19 » Nussinov Algorithm
10:40 PM
Nussinov Algorithm

Nussinov Algorithm

1. Description:
   Nussinov algorithm is an algorithm to predict possible RNA secondary structure (folding), discovering parts that have complementary sequences. Unlike search algorithms, this method works with the single sequence. It computes the sequence against itself using dynamic programming table, however the matching scores are set so that the letters are treated as "matching" if the corresponding nucleotides would pair. Hence A "matches" U and G "matches" C, but none of the four (AGUC) nucleotides matches itself as two identical nucleotides do not pair. 

2. Algorithm:
   The algorithm is formally written as

   Here w(i,j) = 1 if chars at the positions i and j are complementary (AU, UA, GC, CG). Otherwise this function is 0.

   Another formula set that may be simpler is given in  and referred as Nussinov-Jacobson algorithm:

   With initialization

   Nussinov algorithm does not necessary generates the most stable structure and may have scattered matches that are not biologically reasonable. It is too simple to be accurate but has been a stepping stone for more complex algorithms.

3. Key points:
  • Given an RNA sequence of length L.
    • Find the structure with the most base pairs
  • There are four ways of getting the best structure between position i and j from the best structures of the smaller subsequences:
    1. Add unpaired position i onto best structure for subsequence i+1, j
    2. Add unpaired position j onto best structure for subsequence i, j-1
    3. Add i,j pair onto best structure found for subsequence i+1, j-1
    4. Combine two optimal structures i,k and k+1, j
  • Four Cases
    • The numbers above each case corresponds to the enumeration on the previous slide.
  • Compares a sequence against itself in a dynamic programming matrix.
  • Since structure folds upon itself, it is only necessary to calculate half the matrix.
  • The value of M[i,j] is the number of base pairs in the maximally base-paired structure.
  • Four rules for scoring the structure at a particular point.
4. Pseudocode:
  1. Initialization: fill the main diagonal and the diagonal just below it with zeros
  2. Formally, the scoring matrix, M, is initialized:
    • M[i,i] = 0 for i = 1 to L (main diagonal)
    • M[i,i-1] = 0 for i = 2 to L (diagonal below main diagonal)
  3. Matrix Fill: Starting with all subsequences of length 2, to length L
M[i,j] = max of the following :
  • M[i+1,j] (base i is hanging off by itself)
  • M[i,j-1] (base j is hanging off by itself)
  • M[i+1,j-1] + S(xi, xj) (bases i and j are paired; if xi = complement of xj, then S(xi, xj) = 1; otherwise it is 0)
  • M[i,j] = MAXi<k<j (M[i,k] + M[k+1,j]) (merging two substructures)
 4. Traceback:

5. Pseudoknots:
   The algorithm above detects complementary sequences but tells nothing about how the RNA would fold. The usually expected structure ("knots") implies that one independent matching pair of sequences either completely precedes the other or is fully nested within. To emphasize this fact, RNA folding is described using the dot-bracket notation, marking every pairing nucleotide with ( and its matching counterpart with ). Parenthesis in the resulted expression must be balanced, for instance
  • ....(((...)))......(((((......)))))...... - two independent
  • ....(((........(((.....)))...)))......... - one inside another
  • ....(((........(((.....)))..((((...))))....))) - two inside one
   Using this rule, it is possible to write a wrapping algorithm that in many cases correctly predicts RNA folding. If these rules are violated, it is said that RNA forms "pseudoknots". Pseudoknots do exist in nature, and more complex algorithms are required to discover them.

6. Alternative Sequences:
   A "real world" size RNA may have multiple alternative ways of folding, when either one or another subset of the detected matching sequences are involved. As a rule, RNA folds in the way that is the most "thermodinamicaly stable" - this usually means that the folding case that has the biggest total number of paired nucleotides takes over others.

Category: Algorithms | Views: 9358 | Added by: Ansari | Rating: 0.0/0
Total comments: 2
2 MelbaUrive   [Entry]
free resume templates for illustrator
free business plan for call center
format of effective resume

1 beardswyiaj   [Entry]
锘縖url=] win key sale[/url] yepsww
锘縖url=] Office Professional Plus 2007[/url] bjhy

Name *:
Email *:
Code *:
Log In

«  August 2011  »
Entries archive
Site friends
Copyright MyCorp © 2020
Free website builderuCoz