AspGD Help: BLAST Searches



Description

BLAST stands for Basic Local Alignment Search Tool and was developed by Altschul et al. (1990). It is a very fast search algorithm that searches either protein or DNA sequence databases, with each database type searched separately. BLAST is best suited to sequence similarity searching rather than motif searching.

A fairly complete on-line guide to BLAST searching can be found at the NCBI BLAST Help Manual. AspGD has a separate help document for the BLAST results page.

BLAST searches offered by AspGD allow users to compare any query sequence to Aspergillus sequence datasets. To search other datasets, NCBI BLAST can be used. To search other fungal sequences, use SGD's Fungal BLAST tool.

Using BLAST

AspGD offers these five BLAST programs to accommodate different types of searches:

  1. BLASTN compares a nucleotide query sequence against a nucleotide sequence dataset;
  2. BLASTP compares an amino acid query sequence against a protein sequence dataset;
  3. BLASTX compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence dataset;
  4. TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence dataset;
  5. TBLASTN compares a protein query sequence against a nucleotide sequence dataset dynamically translated in all six reading frames (both strands).
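
The list above can be summarized as a lookup from query type and dataset type to the applicable programs. The following Python sketch is only an illustration; the names PROGRAMS and programs_for are hypothetical and are not part of the AspGD BLAST tool.

```python
# Query and dataset types for each program, as described in the list above.
# "translated" records which side is conceptually translated in six frames.
PROGRAMS = {
    "BLASTN":  {"query": "nucleotide", "dataset": "nucleotide", "translated": ()},
    "BLASTP":  {"query": "protein",    "dataset": "protein",    "translated": ()},
    "BLASTX":  {"query": "nucleotide", "dataset": "protein",    "translated": ("query",)},
    "TBLASTX": {"query": "nucleotide", "dataset": "nucleotide", "translated": ("query", "dataset")},
    "TBLASTN": {"query": "protein",    "dataset": "nucleotide", "translated": ("dataset",)},
}


def programs_for(query_type, dataset_type):
    """Return the names of the programs that accept this query/dataset pairing."""
    return [
        name
        for name, spec in PROGRAMS.items()
        if spec["query"] == query_type and spec["dataset"] == dataset_type
    ]


if __name__ == "__main__":
    print(programs_for("nucleotide", "nucleotide"))
    print(programs_for("protein", "nucleotide"))
```

A nucleotide query searched against a nucleotide dataset can therefore use either BLASTN (direct comparison) or TBLASTX (comparison at the protein level).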

AspGD also offers a selection of sequence databases that can be searched, depending on the user's requirements.

Sequences can be submitted for a BLAST search in two different ways. The sequence can be uploaded from a local text file with FASTA, GCG, or RAW formatting, or the sequence can be typed or pasted into the Query Sequence window. (Note: The contents of an uploaded sequence file will not be displayed in the Query Sequence window of the search page.)
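
As an illustration of the difference between FASTA and RAW input, the Python sketch below reads a single sequence from either format. The function name read_query is hypothetical, the sketch assumes one sequence per input, and it does not handle GCG files; it is not the parser used by AspGD.

```python
def read_query(text):
    """Return (identifier, sequence) from single-record FASTA or raw text.

    FASTA input begins with a ">" header line; raw input is sequence only,
    in which case the identifier is None.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    identifier = None
    if lines and lines[0].startswith(">"):
        header = lines[0][1:].split()
        identifier = header[0] if header else ""
        lines = lines[1:]
    # Remove internal whitespace and normalize to upper case.
    sequence = "".join("".join(ln.split()) for ln in lines).upper()
    return identifier, sequence


if __name__ == "__main__":
    print(read_query(">seq1 example\nATGC\nGGTA\n"))
    print(read_query("atg cgg\nta"))
```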

To use the Upload Local File option:

If a BLAST search returns few or no matches, there are several ways to try to increase the number of matches. On the BLAST search page, the user can change the database searched, change the protein comparison matrix, or increase the number of alignments shown.

Changing other options can also change the outcome of the BLAST search. The Expect threshold ("E") reflects the number of matches expected to be found by chance. If the expect value of a match is greater than the Expect threshold, the match will not be reported. The default E threshold is 10. Decreasing the E threshold increases the stringency of the search, so fewer matches are reported. Increasing the E threshold decreases the stringency of the search, so more matches are reported.

If a query sequence is short (fewer than about 30 residues), the user will want to adjust the Cutoff Score ("S") to a lower value, which results in a less stringent criterion for reporting matches.

The user can also change the word length ("W"). BLAST first searches for a perfect match of at least the word length. Once a match is found, BLAST tries to extend it into a high-scoring segment pair (HSP). The default W value for BLASTN is 11; for all other programs the default is 3. If the word length is set below 11, the query sequence must be shorter than 5000 bp.
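
The effect of the word length and the Expect threshold can be sketched in Python. This is a simplified illustration, not the BLAST implementation: find_seeds and reportable are hypothetical names, the hit records are made-up dictionaries with an "e_value" key, and the sketch finds only exact word matches without extending them into HSPs.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            seeds.append((i, j))
    return seeds


def reportable(hits, e_threshold=10.0):
    """Keep only hits whose expect value does not exceed the E threshold."""
    return [hit for hit in hits if hit["e_value"] <= e_threshold]


if __name__ == "__main__":
    # A shorter word length finds more seeds in the same pair of sequences.
    print(len(find_seeds("ACGTAC", "TTACGT", 4)))
    print(len(find_seeds("ACGTAC", "TTACGT", 3)))

    hits = [
        {"id": "a", "e_value": 1e-30},
        {"id": "b", "e_value": 0.5},
        {"id": "c", "e_value": 25.0},
    ]
    # A lower E threshold is more stringent and reports fewer hits.
    print([h["id"] for h in reportable(hits)])
    print([h["id"] for h in reportable(hits, e_threshold=1e-5)])
```

Lowering W gives the search more starting points to extend, while lowering E discards matches that are more likely to have arisen by chance.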

In Aspergillus, nuclear-encoded proteins are translated using translation Table 1 (Standard). For BLAST searches in which a nucleotide dataset must be translated (TBLASTN and TBLASTX), the AspGD BLAST tool uses Table 1 to translate the nuclear gene datasets that are available as BLAST target datasets in AspGD. When the user enters a nucleotide query sequence for a BLAST search that requires its translation (BLASTX and TBLASTX), the translation table for the query must also be chosen. To handle these query sequences accurately, the user should set the "Query translation table" parameter to the table appropriate for the query sequence. By default, the user-supplied query sequence is translated using the same table that is appropriate for the dataset against which it is being searched (i.e., Table 1). However, if, for example, a C. albicans nucleotide sequence were being used in a BLASTX or TBLASTX search, then translation Table 12 (the Alternative Yeast Nuclear table) should be selected. Please see NCBI's Taxonomy browser and Translation Table web page for more information about alternate translation tables.
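
The following Python sketch illustrates why the choice of table matters. It builds Table 1 from the standard codon ordering and derives Table 12 by changing the single codon that differs (CTG, read as leucine in Table 1 and as serine in Table 12), then translates a sequence in all six reading frames. The function names are hypothetical, and the sketch is not the translation code used by the AspGD BLAST tool.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for Table 1 (Standard), in TCAG codon order (TTT, TTC, TTA, ...).
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
TABLE_1 = dict(zip(CODONS, STANDARD_AAS))
# Table 12 (Alternative Yeast Nuclear) differs only in CTG: Ser instead of Leu.
TABLE_12 = dict(TABLE_1, CTG="S")


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, table):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq, table):
    """Return the six-frame conceptual translation, keyed '+1'..'+3', '-1'..'-3'."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:], table)
    return frames


if __name__ == "__main__":
    query = "ATGCTGTAA"
    print(translate(query, TABLE_1))   # CTG read as Leu
    print(translate(query, TABLE_12))  # CTG read as Ser
    print(six_frames(query, TABLE_1))
```

A query containing CTG codons therefore yields a different protein sequence under each table, which changes the alignments found by BLASTX and TBLASTX.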

BLAST queries in which the query or target dataset is translated use the "span" option, described in the WU-BLAST documentation. Use of "span" allows multiple overlapping alignments to be displayed; some query results may display a pair of sequences that are aligned in multiple potential reading frames.

BLAST searches are also subject to filtering. A filter removes repetitive sequences from a query, so that the results of the BLAST search are fewer and, ideally, more informative. For nucleic acid query sequences, the "dust" filter is used as the default. For all other searches, the "seg" filter is the default. The user can always use the "Filter options" pull-down menu to select a different filter option or to remove filtering entirely (select "none").
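
The general idea of filtering can be sketched in Python. The example below is not the "dust" or "seg" algorithm; it substitutes a much simpler technique, masking any window whose Shannon entropy falls below a cutoff. The function names, the window size, and the cutoff are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits per symbol) of the characters in a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, min_entropy=1.0, mask_char="N"):
    """Replace every window with entropy below min_entropy by mask_char."""
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < min_entropy:
            masked[start:start + window] = mask_char * window
    return "".join(masked)


if __name__ == "__main__":
    query = "ACGTACGTACGT" + "A" * 20 + "TGCATGCATGCA"
    print(mask_low_complexity(query))
```

The run of repeated A residues is masked while the more varied flanking sequence is mostly retained, so the masked region can no longer seed spurious matches.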

Accessing the BLAST Search Page

BLAST can be accessed by selecting the BLAST link on the menu bar at the top of most AspGD web pages, or by using the Sequence Analysis Tools menu on the right-hand sidebar of any AspGD Locus Page, which opens a BLAST search page with the locus sequence already filled in.

