Tool: Blast Protein
Blast Protein searches for similar protein sequences
using a BLAST web service hosted by the
Resource for Biocomputing, Visualization, and Informatics (RBVI).
Any protein structure chain in ChimeraX can be used as the query.
The command implementation additionally allows using a sequence
open in ChimeraX as the query.
For hits corresponding to structures in the
Protein Data Bank (PDB), the
structures can be retrieved and automatically superimposed onto the query chain.
Like other tools, this interface can be opened from the Tools menu.
The search parameters are:
- Chain – chain (of a structure open in ChimeraX)
to use as the query sequence
- Database – protein sequence database to search:
  - PDB (default) – experimentally determined structures in the
    Protein Data Bank (PDB)
  - NR – NCBI “non-redundant” database containing
    GenBank CDS translations + PDB + SwissProt + PIR + PRF,
    excluding environmental samples from whole-genome sequencing; this database
    is much larger than PDB alone and takes much longer to search
- # Sequences (default 100) – maximum number of unique sequences to return;
more hits than this number may be obtained because multiple structures
or other sequence-database entries may have the same sequence
- Matrix – amino acid substitution matrix to use for alignment scoring:
  - BLOSUM62 (default)
- Cutoff (default 1e-3) – significance cutoff;
only hits with E-values no larger than the specified value will be returned
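The interplay between the cutoff and the unique-sequence limit can be illustrated with a minimal Python sketch. This is not the tool's implementation (the search itself runs on the RBVI web service). The `Hit` record, the `select_hits` function, and the example identifiers are hypothetical, and the sketch assumes hits are ranked by ascending E-value.

```python
from collections import namedtuple

# Hypothetical record for one database hit (not a ChimeraX type).
Hit = namedtuple("Hit", ["name", "sequence", "evalue"])


def select_hits(hits, max_seqs=100, cutoff=1e-3):
    """Keep hits with E-value <= cutoff, limited to max_seqs unique sequences.

    Every entry that shares a retained sequence is kept, so the result
    can contain more hits than max_seqs.
    """
    # Assumption: best (smallest) E-values are considered first.
    passing = sorted((h for h in hits if h.evalue <= cutoff),
                     key=lambda h: h.evalue)
    kept_sequences = set()
    selected = []
    for hit in passing:
        if hit.sequence not in kept_sequences:
            if len(kept_sequences) >= max_seqs:
                continue  # a new sequence beyond the limit is dropped
            kept_sequences.add(hit.sequence)
        selected.append(hit)
    return selected


# Made-up example data for illustration only.
hits = [
    Hit("1abc_A", "MKTAYIAK", 1e-40),
    Hit("1abc_B", "MKTAYIAK", 1e-40),  # same sequence as 1abc_A
    Hit("2xyz_A", "MKTGYIAR", 1e-12),
    Hit("3def_A", "MRTAFLAK", 5e-2),   # fails the 1e-3 cutoff
]
print([h.name for h in select_hits(hits, max_seqs=1)])
```

With a limit of one unique sequence, both chains of the first made-up entry are still returned, since they share a sequence.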
Clicking the Blast button initiates the search.
When results are returned, the hits are listed in a table below the
search parameters in the Blast Protein window.
A text-search box allows finding hits that contain terms of interest.
Pull-down menus to the right of the search box allow adjusting the number
of hits shown at a time and which columns of information are included
in the table. Clicking a column header sorts by the values in that column.
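The text search and column sorting described above behave roughly like the following Python sketch. It is only a conceptual model of the table, not ChimeraX code. The function name `filter_and_sort`, the case-insensitive matching, and the row values are assumptions made for illustration.

```python
def filter_and_sort(rows, term="", sort_column=None, descending=False):
    """Return rows containing term in any field, optionally sorted by a column."""
    needle = term.lower()
    # Assumption: the search is case-insensitive and checks every column.
    matches = [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]
    if sort_column is not None:
        matches.sort(key=lambda row: row[sort_column], reverse=descending)
    return matches


# Made-up table rows using column names from this page.
rows = [
    {"Name": "1abc_A", "E-Value": 1e-40, "Score": 310.0,
     "Description": "lysozyme C"},
    {"Name": "2xyz_A", "E-Value": 1e-12, "Score": 120.5,
     "Description": "alpha-lactalbumin"},
    {"Name": "4ghi_B", "E-Value": 1e-25, "Score": 205.0,
     "Description": "lysozyme, mutant"},
]
lysozymes = filter_and_sort(rows, term="lysozyme", sort_column="E-Value")
print([row["Name"] for row in lysozymes])
```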
Some columns are available for NR sequences
regardless of whether they are in the PDB subset:
- Name:
  - for PDB sequences, the corresponding PDB identifier, including the chain;
    clicking it fetches the
    structure and superimposes the hit chain onto the query chain
  - otherwise, the sequence GI number; clicking it opens the web page for
    that sequence in the NCBI Protein database
- E-Value – significance value
- Score – alignment score
- Description – a brief description of the sequence
- URL – URL used by ChimeraX for the Name click action
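The two kinds of Name click action can be sketched as a small dispatch function in Python. Everything here is an assumption made for illustration: the `name_click_action` function, the `ID_chain` naming pattern for PDB hits, the action labels, and the NCBI URL pattern are not taken from ChimeraX.

```python
import re

# Assumed URL pattern for the NCBI Protein database (illustrative only).
NCBI_PROTEIN_URL = "https://www.ncbi.nlm.nih.gov/protein/{gi}"

# Assumed Name format for PDB hits: 4-character ID, underscore, chain ID.
PDB_NAME = re.compile(r"^([0-9][A-Za-z0-9]{3})_(\w+)$")


def name_click_action(name):
    """Decide what a click on a Name entry would do (hypothetical helper)."""
    match = PDB_NAME.match(name)
    if match:
        pdb_id, chain_id = match.groups()
        # PDB hit: fetch the structure and superimpose the hit chain.
        return ("fetch_and_superimpose", pdb_id.lower(), chain_id)
    # Otherwise treat the name as a GI number and open its web page.
    return ("open_url", NCBI_PROTEIN_URL.format(gi=name))


print(name_click_action("1ABC_A"))
print(name_click_action("123456789"))
```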
Additional columns of information are available for PDB sequences:
- Date – structure deposition date
- Title – structure title
- Method – method of structure determination
- Resolution – crystallographic resolution
- Authors – structure authors
- PubMed – PubMed identifier of literature reference, if any
- Total atoms – total number of atoms in the structure (all chains)
- Total residues – total number of residues in the structure (all chains)
- Chain names – chain identifiers and descriptions of polymer chains in the structure
- Copies – number of copies of the hit chain in the structure
- Polymers – number of different polymer chains in the structure
(not counting multiple copies of the same sequence)
- Residues – number of residues in the hit chain
- Species – scientific name of source organism
identifier, if any, for hit chain
- Weight – molecular weight of hit chain
- Ligand formulas – chemical formulae of ligand chemical components
- Ligand names – names of ligand chemical components
- Ligand smiles – SMILES strings of ligand chemical components
- Ligand symbols – residue names of ligand chemical components
- Ligand weights – molecular weights of ligand chemical components
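The difference between the Copies and Polymers columns listed above can be made concrete with a short Python sketch. It assumes that chains count as copies when their sequences are identical. The function `copies_and_polymers` and the example chains are hypothetical.

```python
def copies_and_polymers(chain_sequences, hit_chain):
    """Return (copies of the hit chain, number of distinct polymer sequences).

    chain_sequences maps chain ID -> sequence for one structure.
    Assumption: two chains are copies if their sequences are identical.
    """
    hit_sequence = chain_sequences[hit_chain]
    copies = sum(1 for seq in chain_sequences.values() if seq == hit_sequence)
    polymers = len(set(chain_sequences.values()))
    return copies, polymers


# Made-up structure: chains A and B share a sequence, chain C differs.
chains = {"A": "MKTAYIAK", "B": "MKTAYIAK", "C": "GSHMLEDP"}
copies, polymers = copies_and_polymers(chains, "A")
print(copies, polymers)
```

In this made-up structure the hit chain A has two copies (A and B), while the structure contains two different polymers.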
The Blast Protein tool, including search results
(but not window size or location), is saved in ChimeraX sessions.
The Basic Local Alignment Search Tool (BLAST)
software is provided by the
NCBI and described in:
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ.
Nucleic Acids Res. 1997 Sep 1;25(17):3389-402.
Basic local alignment search tool.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ.
J Mol Biol. 1990 Oct 5;215(3):403-10.
UCSF Resource for Biocomputing, Visualization, and Informatics