BLAST (Basic Local Alignment Search Tool) is a method for assessing sequence similarity. The program takes a query sequence and searches it against a database selected by the user, aligning the query against every subject sequence in that database. The results are reported as a ranked list of hits, followed by the individual sequence alignments along with various statistics and scores. Every hit in the list is assigned a similarity score S. To judge how likely that score is to arise by chance, a so-called E-value is calculated for every hit. The E-value for a score S is the expected number of hits with score S or higher that would be found in the database by chance alone.
For a detailed discussion of the statistics used in BLAST, check the following link.
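To make the relationship between the score S and the E-value concrete, here is a minimal sketch based on the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m is the query length and n is the total database length. The function name, the lengths, and the lambda and K values below are assumptions chosen for illustration; BLAST itself derives lambda and K from the scoring system in use and applies corrected "effective" lengths, which this sketch ignores.

```python
import math

def expected_hits(score, query_len, db_len, lam, k):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * score)

# Illustrative parameters only (assumed, not taken from a real search).
LAMBDA = 0.267
K = 0.041

if __name__ == "__main__":
    for s in (40, 80, 120):
        e = expected_hits(s, query_len=250, db_len=50_000_000, lam=LAMBDA, k=K)
        print(f"S={s}: E={e:.3g}")
```

The E-value drops exponentially as S grows and scales linearly with the size of the search space, which is why the same alignment can look less significant when searched against a larger database.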
2. How to use BLAST:
The Advanced BLAST page has many parameters that you can adjust, and the outcome of a BLAST search depends on the parameters you use.
Types of BLAST programs
There are five different BLAST programs, which can be distinguished by the type of the query sequence (DNA or protein) and the type of the subject database:
BLASTP compares an amino acid query sequence against a protein sequence database;
BLASTN compares a nucleotide query sequence against a nucleotide sequence database;
BLASTX compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database;
TBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands);
TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
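BLASTX, TBLASTN and TBLASTX all rely on six-frame conceptual translation: three reading frames on the given strand and three on its reverse complement. The sketch below shows what that means for a single nucleotide sequence. It assumes the standard genetic code, and the function names and frame labels ("+1" to "-3") are illustrative choices, not BLAST's own code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to conceptual translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

For the toy input ATGGCCTAA, frame +1 reads ATG GCC TAA and gives MA*, where * marks a stop codon. In a translated search, each of the six products is compared as a protein sequence.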
There are many databases that can be used as subject databases. One of the most commonly used is the nr database, a collection of "non-redundant" sequences from GenBank and other sequence databanks. For other available subject databases, click here.
BLAST accepts the query sequence in FASTA format (see the different formats we discussed in the last post) or as an accession number or GI number.
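As a reminder of what FASTA input looks like, the following minimal sketch reads FASTA text into (header, sequence) pairs: a line starting with ">" opens a record, and the lines after it hold the sequence. The function name and the two example records are made up for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

if __name__ == "__main__":
    example = ">query1 hypothetical example\nATGGCC\nTAA\n>query2\nTTTGGG\n"
    for name, seq in parse_fasta(example):
        print(name, len(seq), seq)
```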
Parameters to adjust
EXPECT value: The statistical significance threshold for reporting matches against database sequences. The default value is 10, meaning that 10 matches are expected to be found merely by chance. If the statistical significance ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Increasing the EXPECT value forces the program to report less significant matches.
FILTER (Low-complexity): Masks off segments of the query sequence that have low compositional complexity (i.e. regions of biased composition, such as short-period repeats).
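The effect of the EXPECT threshold can be sketched as a simple filter over a hit list. The hit records, field names and function name below are hypothetical and exist only to show the rule described above: a match is dropped when its E-value is greater than the threshold.

```python
def report_hits(hits, expect=10.0):
    """Keep hits with E-value <= expect, most significant (lowest E) first."""
    kept = [hit for hit in hits if hit["evalue"] <= expect]
    return sorted(kept, key=lambda hit: hit["evalue"])

if __name__ == "__main__":
    # Made-up hits for illustration only.
    hits = [
        {"id": "subjectA", "evalue": 2e-40},
        {"id": "subjectB", "evalue": 0.5},
        {"id": "subjectC", "evalue": 25.0},
    ]
    print([h["id"] for h in report_hits(hits)])               # default threshold
    print([h["id"] for h in report_hits(hits, expect=1e-5)])  # stricter
    print([h["id"] for h in report_hits(hits, expect=100)])   # more permissive
```

Lowering the threshold leaves only the strongest matches, while raising it lets through matches that are increasingly likely to be chance hits.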
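To illustrate the idea of masking low-complexity regions, here is a deliberately simplified sketch that slides a window along a protein sequence and masks any window whose Shannon entropy falls below a cutoff. This is a stand-in for illustration, not the filtering algorithm BLAST actually uses, and the window size, entropy cutoff and mask character are all assumed values. The cutoff is meant for protein sequences; nucleotide sequences have at most 2 bits of entropy per position and would need a lower one.

```python
import math
from collections import Counter

def shannon_entropy(segment):
    """Shannon entropy (bits per residue) of the residue composition."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mask_low_complexity(seq, window=12, min_entropy=2.2, mask_char="X"):
    """Replace every window whose entropy is below min_entropy with mask_char."""
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < min_entropy:
            masked[start:start + window] = mask_char * window
    return "".join(masked)

if __name__ == "__main__":
    # Made-up sequence with a serine-rich run in the middle.
    print(mask_low_complexity("MKTAYIAKQRSSSSSSSSSSSSSSGLNDIFEAQKIEWHE"))
```

Masked residues are ignored when matching, so a biased stretch such as a long run of one amino acid does not produce many high-scoring but biologically uninformative hits.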
For a tutorial aimed at first-time BLAST users, click here. For a more advanced one, click here.