
Command: blastprotein

blastprotein  sequence  [ database  sequence-database ] [ version  1 | 2 | 3 ] [ matrix  similarity-matrix ] [ cutoff  evalue ] [ maxSeqs  M ] [ log  true | false ] [ name  N ]

The blastprotein command runs a protein sequence similarity search using a BLAST web service hosted by the UCSF Resource for Biocomputing, Visualization, and Informatics (RBVI). It is the command implementation of the Blast Protein tool. One use is to search with a target sequence of unknown structure to find templates for comparative modeling. See also: alphafold search, esmfold search

The query sequence can be given as any of the following:

The protein sequence-database to search can be:

The matrix option indicates which amino acid similarity-matrix to use for alignment scoring (uppercase or lowercase can be used):

The cutoff evalue is the maximum (least significant) E-value that still qualifies as a hit (default 1e-3). Results can also be limited with the maxSeqs option (default 100), which sets the maximum number of unique sequences to return. More hits than this number may be obtained because multiple structures or other sequence-database entries may have the same sequence.
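The interaction between cutoff and maxSeqs can be illustrated with a short Python sketch. It is not the web service's implementation: the function name filter_hits and the (entry_id, sequence, evalue) tuple layout are hypothetical, and treating the cutoff as inclusive is an assumption. It only shows why capping the number of unique sequences can still yield more hits than maxSeqs.

```python
from collections import OrderedDict


def filter_hits(hits, cutoff=1e-3, max_seqs=100):
    """Illustrative filter; 'hits' is a list of (entry_id, sequence, evalue).

    The tuple layout is hypothetical and only serves this example.
    """
    # Keep hits whose E-value does not exceed the cutoff (assumed inclusive),
    # then order them from most to least significant.
    kept = sorted((h for h in hits if h[2] <= cutoff), key=lambda h: h[2])

    # Group database entries by sequence, admitting at most max_seqs
    # distinct sequences.
    by_seq = OrderedDict()
    for entry_id, seq, evalue in kept:
        if seq not in by_seq:
            if len(by_seq) >= max_seqs:
                continue  # unique-sequence cap reached; skip new sequences
            by_seq[seq] = []
        by_seq[seq].append((entry_id, evalue))

    # Flatten back to one row per database entry.
    return [(entry_id, seq, evalue)
            for seq, entries in by_seq.items()
            for entry_id, evalue in entries]


# Made-up hits: two entries share one sequence, one is above the cutoff.
example_hits = [
    ("entryA", "MKVLA", 1e-10),
    ("entryB", "MKVLA", 1e-10),
    ("entryC", "MKTLA", 1e-5),
    ("entryD", "MAAAA", 0.5),
]
capped = filter_hits(example_hits, cutoff=1e-3, max_seqs=1)
```

With max_seqs=1, the example returns two rows (entryA and entryB) because both entries carry the same sequence, while entryD is dropped by the E-value cutoff regardless of the cap.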

When the search completes, the results are shown in a separate window. Additional types of information can be shown for each hit and used to sort the list, such as alignment scores, structure resolution, and ligand residue names.

Double-clicking a row with an associated structure (AlphaFold or PDB) fetches the structure and, if a structure chain was used as the query, automatically superimposes the hit and query structures with matchmaker. AlphaFold-predicted structures are colored by confidence on a 0-100 scale, and ESMFold-predicted structures are colored by confidence on a 0-1 scale. One or more hits can be chosen (highlighted) in the list, and the panel's context menu can then be used to fetch and superimpose all of the corresponding structures, or to show their multiple sequence alignment with the query.

The log option indicates whether to also list the results in the Log (default false).

The name option allows supplying a name for a specific set of Blast Protein results, which may be useful when several sets of results are shown at the same time. The name appears in the title bar of the results panel.
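As a rough illustration of how the options in the usage line combine, the Python sketch below assembles a blastprotein command string. The helper build_blastprotein_command is hypothetical and not part of ChimeraX; the keyword names and the allowed version values come from the usage line above, while the chain specifier #1/A, the result name, and the double-quoting of names that contain spaces are assumptions made for the example.

```python
def build_blastprotein_command(sequence, database=None, version=None,
                               matrix=None, cutoff=None, max_seqs=None,
                               log=None, name=None):
    """Assemble a blastprotein command string (hypothetical helper).

    Options left as None are omitted so the command's defaults apply.
    """
    parts = ["blastprotein", sequence]
    if database is not None:
        parts += ["database", str(database)]
    if version is not None:
        # The usage line lists 1, 2, and 3 as the accepted values.
        if version not in (1, 2, 3):
            raise ValueError("version must be 1, 2, or 3")
        parts += ["version", str(version)]
    if matrix is not None:
        parts += ["matrix", str(matrix)]
    if cutoff is not None:
        parts += ["cutoff", f"{float(cutoff):g}"]
    if max_seqs is not None:
        parts += ["maxSeqs", str(int(max_seqs))]
    if log is not None:
        parts += ["log", "true" if log else "false"]
    if name is not None:
        # Assumption: a name containing whitespace needs quoting.
        text = str(name)
        parts += ["name", f'"{text}"' if any(c.isspace() for c in text) else text]
    return " ".join(parts)


# "#1/A" assumes chain A of model #1 is open; "run1" is an arbitrary name.
command = build_blastprotein_command("#1/A", cutoff=1e-5, max_seqs=50,
                                     log=True, name="run1")
print(command)
```

The resulting string can be typed at the ChimeraX command line, or passed to run(session, command) from chimerax.core.commands when working in the ChimeraX Python shell.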

UCSF Resource for Biocomputing, Visualization, and Informatics / April 2023