ChimeraX Tutorial: Coloring by Sequence Conservation

last updated June 2021

This tutorial shows how to color a structure by the conservation in a multiple sequence alignment. In principle, any sequence alignment that ChimeraX can read and can associate with the structure of interest can be used for conservation coloring. However, the required number and diversity of sequences in the alignment depend on what you are trying to illustrate.

See also: Color Key, online sources of sequence alignments

The three parts of this tutorial can be done independently.

  1. H-Ras: Adjusting the Coloring
  2. G Protein-Coupled Receptor (GPCR): Sequence-Structure Association
  3. Bacterial Outer-Membrane Protein: Adjusting Conservation Parameters

Start ChimeraX. If you want to use the click-to-execute links in this page, view it in the ChimeraX Browser, e.g. by entering the ChimeraX command:

Command: open https://www.rbvi.ucsf.edu/chimerax/data/conservation-coloring/conservation-coloring.html

H-Ras: Adjusting the Coloring

Fetch 121p, a structure of H-Ras, from the RCSB Protein DataBank:

Command: open 121p

Hide water and show the ligand as ball-and-stick:

Command: hide solvent
Command: style ligand ball

If you wish, review how to manipulate structures (part of the Binding Sites tutorial).

You can download 121p-consurf.aln as plain text to a convenient location on your computer and open it with the menu File... Open, or fetch the sequence-alignment file directly from our website with the following command:

Command: open https://www.rbvi.ucsf.edu/chimerax/data/conservation-coloring/121p-consurf.aln

[Figure: Sequence Viewer tabbed with Model Panel]

The Sequence Viewer can be a separate window, or it can be docked into the combined ChimeraX window. The sequence window can be moved, docked, or undocked by dragging its top bar, and resized by dragging its edges (if docked) or corners (if undocked). The image shows it docked into the same area as the Models panel so that the two are tabbed. This was done by dragging the sequence window and hovering it over that area. Clicking a tab brings the corresponding panel to the front.

The first sequence in the alignment automatically associates with the structure, as indicated by the colored box around its name: Input_pdb_SEQRES_A. The sequence has this name because the structure was submitted to the ConSurf Server to obtain the alignment.

[Figure: 121p colored by conservation]

A histogram above the sequences indicates the conservation per column. These conservation values are automatically assigned to the residues of any associated structures as an attribute named seq_conservation. The default method for calculating conservation is the entropy-based measure from AL2CO (a program included with ChimeraX courtesy of Pei and Grishin).
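To make the idea of a per-column, entropy-based score concrete, here is a minimal Python sketch. It is not AL2CO's or ChimeraX's actual calculation: it uses plain unweighted residue frequencies, and the function names, the gap characters, and the toy alignment are all hypothetical. The final z-score step is likewise an assumption, included only because the values reported later in this tutorial span both negative and positive numbers.

```python
import math
from collections import Counter

GAP_CHARS = "-."  # characters treated as gaps (an assumption of this sketch)


def column_conservation(column):
    """Return the negative Shannon entropy of the non-gap residues in a column.

    0.0 means fully conserved; more negative means more variable.
    Returns None for a column that contains only gaps.
    """
    residues = [c.upper() for c in column if c not in GAP_CHARS]
    if not residues:
        return None
    total = len(residues)
    score = 0.0
    for count in Counter(residues).values():
        p = count / total
        score += p * math.log(p)  # sum of p*ln(p) is the negative entropy
    return score


def alignment_conservation(sequences):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*sequences)]


def z_scores(values):
    """Rescale scores to mean 0 and standard deviation 1, skipping None."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    mean = sum(present) / len(present)
    std = math.sqrt(sum((v - mean) ** 2 for v in present) / len(present))
    if std == 0:
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mean) / std for v in values]


toy_alignment = ["ACD", "ACE", "AGF"]  # hypothetical three-sequence alignment
raw = alignment_conservation(toy_alignment)
scaled = z_scores(raw)
print(raw)
print(scaled)
```

In the toy alignment, the first column is identical in all sequences and so receives the highest score, while the third column, with three different residues, receives the lowest.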

Try the default coloring, where the entire range of values in the structure is spread across the palette:

Command: color byattr seq_conservation

The coloring palette can be specified as a series of colon-separated color names or by any of several built-in palette names:

Command: color byattr seq_conservation palette blue:red:yellow
Command: color byattr seq_conservation palette cyanmaroon

The full range of values is reported in the Log, in this case, -1.47 to 2.83 if the default method for calculating conservation is used. (Part 3 covers how to change the method.) For more emphasis on extreme values, the palette can be mapped to a narrower range, for example:

Command: color byattr seq_conservation palette cyanmaroon range -1.4,1.4

Positions with values below or above the specified range are given the first and last colors of the palette, respectively. The most highly conserved residues (maroon) are in the GTP-binding site.
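The effect of a narrower range can be sketched in Python. This is only an illustration of clamping plus linear interpolation, not ChimeraX's implementation: the function names, the even spacing of palette colors across the range, and the RGB triples standing in for a cyan-to-maroon palette are all assumptions.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB triples, with t in [0, 1]."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))


def color_for_value(value, palette, lo, hi):
    """Map a value onto palette colors spaced evenly between lo and hi.

    Values at or below lo get the first color; values at or above hi
    get the last color.
    """
    if value <= lo:
        return palette[0]
    if value >= hi:
        return palette[-1]
    position = (value - lo) / (hi - lo) * (len(palette) - 1)
    index = int(position)
    return lerp_color(palette[index], palette[index + 1], position - index)


# Illustrative RGB triples (hypothetical stand-ins, not ChimeraX's exact colors)
CYAN = (0.0, 1.0, 1.0)
WHITE = (1.0, 1.0, 1.0)
MAROON = (0.5, 0.0, 0.0)
palette = [CYAN, WHITE, MAROON]

for value in (-1.47, -0.7, 0.0, 1.0, 2.83):
    print(value, color_for_value(value, palette, -1.4, 1.4))
```

With the range set to -1.4,1.4, both 1.4 and 2.83 map to the last color, so the palette's gradations are spent on the middle of the distribution rather than on a few extreme values.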

Notice that cartoon segments at both the N- and C-termini of the protein have retained their original color, tan. Conservation values were not calculated for the corresponding positions in the sequence alignment because they have a high proportion of gaps. You can see this by scrolling the sequence window all the way to the left or right; you may also need to scroll up and down, as the alignment contains many sequences. (Part 3 covers how to change the gap tolerance.)

[Figure: 121p colored by conservation, with novalue shown in yellow]

Any residues not associated with a sequence alignment, including solvent, ligands, ions, and other proteins, will also lack a conservation value. Residues without a conservation value can be assigned a color with the novalue option of the command, but to avoid recoloring ligands, etc., an atom specification may be needed to limit its scope of action. For example, in the following command, the coloring is limited to protein only:

Command: color byattr seq_conservation protein palette cyanmaroon range -1.4,1.4 novalue yellow

Attributes can also be used in command-line specifications, for example, to show and label the residues with conservation values above some cutoff. By default, hide/show commands apply to atomic displays, not the cartoon:

Command: hide protein
Command: show ::seq_conservation>1.8
Command: label ::seq_conservation>1.8
Command: label height 1.3

Try the publication preset, with white background and black silhouette outlines:

Command: preset pub

For publication images, 2D labels (example in the last figure below) are nicer than these “3D” labels that move along with the structure. Remove the 3D labels:

Command: label delete

[Figure: 121p molecular surface colored by conservation]

Molecular surfaces can also be colored with color byattribute, as above, or by simply applying the current atom coloring to the surface:

Command: surface
Command: color fromatoms

Use the interactive preset to return to a black background without outlines, and then close the structure and sequence alignment:

Command: preset inter
Command: close session

GPCR: Sequence-Structure Association

Fetch 3sn6, a structure of the β2-adrenergic receptor signaling complex:

Command: open 3sn6

The Log shows some information about the structure: chains A, B, and G are the trimeric G protein, chain N is a nanobody, and chain R is the receptor, actually a fusion with endolysin. The nonstandard residue is a high-affinity agonist of the receptor.

Use the cylinders preset to show cartoons with tube helices:

Command: preset cylinders

The receptor is a very dark green. Make it a brighter color and hide the nanobody:

Command: color /R yellow
Command: hide /N cartoons

You can download 81321.ali as plain text to a convenient location on your computer and open it with the menu File... Open, or fetch the sequence-alignment file directly from our website with the following command:

Command: open https://www.rbvi.ucsf.edu/chimerax/data/conservation-coloring/81321.ali

[Figure: Sequence Viewer tabbed with Toolbar]

The image shows the sequence window docked across the top of the overall ChimeraX window, in the same area as the Toolbar.

None of the sequences associated automatically with the receptor structure, as you can see by the absence of a colored box around any sequence name.

To manually associate a sequence with a structure chain, right-click in the sequence window (or Ctrl-click if using a Mac trackpad or single-button mouse) to show its context menu and choose Structure Associations. In the left side of the resulting dialog, choose chain R, and on the right, change from Not associated to Best-matching sequence. After a few seconds, the best match is identified and associated; dismiss the dialog. Alternatively, a command could be used to associate chain R with the best-matching sequence in the alignment:

Command: sequence associate /R 81321.ali

Red boxes in the alignment (scroll horizontally to find them) indicate a couple of mismatches between the sequence in the structure and that in the alignment, but such small differences do not cause any problems for association.
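The idea of choosing the best-matching sequence while tolerating a few mismatches can be sketched in Python. This is not the procedure ChimeraX uses for association; it simply strips the gaps from each aligned sequence and scores similarity with the standard-library difflib.SequenceMatcher. The function names and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher


def ungapped(sequence):
    """Remove gap characters from an aligned sequence and uppercase it."""
    return sequence.replace("-", "").replace(".", "").upper()


def best_matching_sequence(chain_sequence, alignment):
    """Return (name, score) of the aligned sequence most similar to the chain.

    alignment maps sequence names to gapped sequences; score is the
    SequenceMatcher ratio, from 0.0 (nothing in common) to 1.0 (identical).
    """
    best_name, best_score = None, -1.0
    for name, gapped in alignment.items():
        matcher = SequenceMatcher(
            None, chain_sequence.upper(), ungapped(gapped), autojunk=False
        )
        score = matcher.ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical data: seqA differs from the chain at a single position
chain = "MKTAYIAKQR"
toy_alignment = {
    "seqA": "MK-TAYLAKQR",
    "seqB": "GGSSGG--PPLLVV",
}
name, score = best_matching_sequence(chain, toy_alignment)
print(name, score)
```

A single substitution lowers the score only slightly, so the closest sequence is still clearly identified, which mirrors why a couple of mismatches do not prevent association.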

Now that the receptor is associated with the alignment, it can be colored by sequence conservation:

Command: color byattr seq_conservation /R & protein range -1.5,3 novalue gray
Command: show ::seq_conservation>2

Make sticks fatter and color the ligand by heteroatom:

Command: size stickrad .5
Command: color ligand byhet

The stark shadows in full lighting are somewhat distracting. Try soft lighting instead, which uses shadows from 64 directions to approximate ambient occlusion:

Command: light soft

The tubes look a bit “smudgy,” especially in the single-color chains. Sometimes this smudginess can be reduced by using a greater number of directions for ambient shadows:

Command: light soft multishadow 512

The tradeoff for this improved appearance of ambient shadows, however, is that using a high number of directions may slow down the response to interactive manipulation. Return to simple lighting, then close the structure and sequence alignment:

Command: light simple
Command: close session

Bacterial Outer-Membrane Protein: Adjusting Conservation Parameters

Fetch 1kmo, an outer-membrane protein of E. coli, and hide atoms so that only the cartoon is shown:

Command: open 1kmo
Command: hide

You can download 56935.ali as plain text to a convenient location on your computer and open it with the menu File... Open, or fetch the sequence-alignment file directly from our website with the following command:

Command: open https://www.rbvi.ucsf.edu/chimerax/data/conservation-coloring/56935.ali

[Figure: Sequence Viewer tabbed with Toolbar]

The image shows the sequence window docked across the top of the overall ChimeraX window, in the same area as the Toolbar.

The structure automatically associated with sequence d1kmoa-, as indicated by the colored box around its name in the sequence window. Color the structure by the values in the Conservation histogram above the alignment:

Command: color byattr seq_conservation

Much of the structure is still the original color (tan), including the interior β-sheet and many loops in the outer β-barrel. Conservation values were not calculated for these residues because they are in columns of the sequence alignment with a high proportion of gaps. The Conservation histogram is blank at these positions. Select residues that lack a conservation value:

Command: select ~::seq_conservation

We can change the conservation settings to allow more gaps, keeping in mind that there should still be enough residues in a column to provide a reasonably accurate measure of conservation.
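The tradeoff can be illustrated with a small Python sketch that reports which alignment columns would receive a value under a given gap tolerance. It is not ChimeraX's or AL2CO's code: the function names, the toy alignment, the 0.5 threshold used for comparison, and the choice to treat a column exactly at the limit as acceptable are all assumptions.

```python
GAP_CHARS = "-."  # characters treated as gaps (an assumption of this sketch)


def gap_fraction(column):
    """Return the fraction of sequences that have a gap in this column."""
    return sum(c in GAP_CHARS for c in column) / len(column)


def scored_columns(sequences, max_gap_fraction):
    """Return indices of columns whose gap fraction is within the limit."""
    return [
        index
        for index, column in enumerate(zip(*sequences))
        if gap_fraction(column) <= max_gap_fraction
    ]


# Hypothetical five-sequence alignment with increasingly gappy columns
toy_alignment = [
    "AC-D",
    "A--D",
    "AC--",
    "A---",
    "-C--",
]
print(scored_columns(toy_alignment, 0.5))
print(scored_columns(toy_alignment, 0.6))
```

Raising the limit admits the last column, but its score would then rest on only two residues, which is the accuracy concern noted above.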

Show the Sequence Viewer context menu (right-click or Ctrl-click on the sequence window, depending on platform) and choose Settings. The settings dialog can be left as a separate window or docked within the overall window, as described above for the Sequence Viewer itself. In the settings, switch to the Headers tab, which includes the conservation parameters. Changes in settings are applied immediately, so that the histogram bars will change right away. However, coloring by conservation must be reapplied to reflect the new values.

Increase the allowed Gap fraction to 0.6, then reevaluate which parts are still missing values:

Command: select ~::seq_conservation

Now much more of the structure is “covered” by conservation values. Clear the selection and reapply the coloring:

Command: select clear
Command: color byattr seq_conservation

The default method for calculating conservation is the entropy-based measure from AL2CO (a program included with ChimeraX courtesy of Pei and Grishin). AL2CO contains other measures (variance-based, sum of pairs) and options besides the Gap fraction. If you like, change any of the AL2CO parameters, see the histogram adjust, and reapply the coloring.

Click Save if you want to save the current settings as preferences. Click Reset to change the settings back to the “factory” defaults.

Color Key and 2D Label

[Figure: 1kmo colored by conservation]

As in part 1, you may want to use different colors and/or apply them over a narrower range of values. Also, in ChimeraX 1.2 (May 2021) or newer, a color key can be added with the key option:

Command: color byattr seq_conservation palette cyanmaroon range -2,3 novalue silver key true

This draws an initial color key, but also opens the Color Key tool and enables redrawing and moving the key with the mouse (the Adjust key... option in the tool). For the example image, the key was redrawn on the lower left, and the Color Key Labels settings were used to put the numerical labels above instead of below the key. The Adjust key... option can be unchecked to restore the mouse to moving the structure instead of the key.

Other setup for the example image:

Command: windowsize 600 600
Command: preset pub
Command: set bg steelblue
Command: light flat
Command: graphics sil depth 0.03

A 2D label was added to explain the key:

Command: 2dlab text "Conservation (AL2CO entropy measure)"

...and the move labels mouse mode was used to drag it to the desired location.

Although it is usually most convenient to adjust the key settings and the key and label positions interactively as described above, commands could be used instead:

Command: key labelSide left/top
Command: key pos 0.06,0.1 size 0.3,0.04
Command: 2dlabels xpos 0.021 ypos 0.021

What sizes and positions to use for these annotations will depend on the dimensions of the graphics window.
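As a rough illustration of that dependence, the Python sketch below converts the key position and size used above into pixels for two window sizes. It assumes that these values are fractions of the window width and height measured from the lower left corner; the function name and the second window size are hypothetical.

```python
def fraction_to_pixels(fraction_xy, window_size):
    """Convert an (x, y) pair of window fractions to whole pixels."""
    return tuple(round(f * w) for f, w in zip(fraction_xy, window_size))


key_pos = (0.06, 0.1)    # values from the key command above
key_size = (0.3, 0.04)

for window in ((600, 600), (1200, 800)):  # the second size is hypothetical
    print(
        window,
        fraction_to_pixels(key_pos, window),
        fraction_to_pixels(key_size, window),
    )
```

Under this assumption, the same fractional settings yield a key twice as wide in a window twice as wide, so settings tuned for a 600 by 600 window may need adjusting for other dimensions.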


UCSF Resource for Biocomputing, Visualization, and Informatics / June 2021