MatchMaker

MatchMaker superimposes structures by first constructing a sequence alignment and then performing a least-squares fit to superimpose the aligned residue pairs. The atoms matched are CA in amino acid residues and C4' in nucleic acid residues. The sequence alignment is scored by residue identity/similarity and, optionally, secondary structure correspondence. For an informal introduction, see the Alignments tutorial. See also:

Meng, E.C., Pettersen, E.F., Couch, G.S., Huang, C.C., and Ferrin, T.E. "Tools for integrated sequence-structure analysis with UCSF Chimera." BMC Bioinformatics 7(1):339 (2006).

If it is already known which residue numbers in each structure should be used for matching, the match command is a faster alternative (it does not include a sequence alignment step).

There are several ways to start MatchMaker, a tool in the Structure Comparison category. MatchMaker is also implemented as the command mmaker (or matchmaker).

The Chain pairing options are mutually exclusive and control which chain sequences are used to construct each reference-match alignment:

Specific chain(s) in reference structure with specific chain(s) in match structure - One or more reference chains should be chosen from the list. Individual chains or blocks of chains can be chosen with the left mouse button; Ctrl-click toggles the status of an individual chain. For each reference chain chosen, one chain to be matched should be chosen from the corresponding pulldown menu. If multiple chains are to be matched to the same reference chain, it is necessary to match them in separate steps (by choosing the chain to match and then clicking Apply). A given chain cannot be matched to two different reference chains simultaneously, and chains from the same structure (molecule model) cannot simultaneously serve as a reference chain and a chain to match.
Specific chain in reference structure with best-aligning chain in match structure - One reference chain and one or more structures to match should be chosen. For each structure to be matched, the chain that aligns to the reference chain with the highest sequence alignment score will be used.
Best-aligning pair of chains between reference and match structure (default) - One reference structure and one or more structures to match should be chosen. For each structure to be matched, the reference-match pair of chains with the highest sequence alignment score will be used.

Regardless of which chain(s) in a model to be matched are aligned in sequence to the reference, the entire model will be reoriented.
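The second and third pairing options both come down to picking the highest-scoring alignment among candidate chains. The following is a minimal sketch of that selection, not MatchMaker's own code: the function names are hypothetical, and toy_score is a stand-in that only counts identical residues at the same position, whereas MatchMaker uses the full sequence alignment score.

```python
from itertools import product


def toy_score(seq_a, seq_b):
    """Stand-in score (assumption): identical residues at equal positions, no gaps."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


def best_chain_for_reference(ref_seq, match_chains, score=toy_score):
    """One reference chain: pick the best-scoring chain of the structure to match."""
    best_id = max(match_chains, key=lambda cid: score(ref_seq, match_chains[cid]))
    return best_id, score(ref_seq, match_chains[best_id])


def best_chain_pair(ref_chains, match_chains, score=toy_score):
    """Whole structures: pick the best-scoring (reference, match) chain pair."""
    best = None
    for (ref_id, ref_seq), (m_id, m_seq) in product(
        ref_chains.items(), match_chains.items()
    ):
        s = score(ref_seq, m_seq)
        if best is None or s > best[2]:
            best = (ref_id, m_id, s)
    return best


# Chain IDs mapped to one-letter sequences (made-up example data).
reference = {"A": "MKVLA", "B": "GGSGG"}
to_match = {"X": "GGSGA", "Y": "MKVLA"}

print(best_chain_pair(reference, to_match))
print(best_chain_for_reference(reference["B"], to_match))
```

With several structures to match, the same selection would simply be repeated once per structure.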

The sequence Alignment algorithm can be Needleman-Wunsch (global; default) or Smith-Waterman (local). Alignment scoring can use secondary structure information along with residue similarity (more details below). The option to Compute secondary structure assignments is available when secondary structure scoring is turned on. It indicates that helices and strands should first be identified with the ksdssp algorithm, overriding any pre-existing secondary structure assignments. A reason to use this option despite existing secondary structure assignments is that the use of consistent criteria tends to improve MatchMaker results. Pre-existing secondary structure information may have been determined with different methods or parameters for different structures. Ksdssp parameter defaults can be adjusted with the compute SS dialog (opened from the Model Panel).
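The difference between the global and local algorithms can be seen in a small scoring-only sketch. It is illustrative rather than MatchMaker's implementation, and it makes simplifying assumptions: fixed match/mismatch scores instead of a substitution matrix, a single linear gap penalty instead of separate opening and extension penalties, and no secondary structure term. The function name and parameter values are hypothetical.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score of sequences a and b.

    local=False: Needleman-Wunsch (global, end gaps are penalized).
    local=True:  Smith-Waterman (local, scores are floored at zero).
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]

    if not local:
        # A global alignment pays for gaps at the sequence ends.
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                table[i - 1][j - 1] + pair,  # align the two residues
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )
            if local:
                cell = max(cell, 0)          # a local alignment may restart
                best = max(best, cell)
            table[i][j] = cell

    return best if local else table[-1][-1]


short, long_ = "ACGT", "GGACGTGG"
print("global:", alignment_score(short, long_))
print("local: ", alignment_score(short, long_, local=True))
```

In this example the global score is reduced by the four unavoidable gap positions, while the local score reflects only the shared ACGT segment.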

When Show alignment(s) in Multalign Viewer is checked, each pairwise reference-match sequence alignment will be shown in a separate window. (Note that Match -> Align can be used to construct a multiple sequence alignment from a multiple superposition.)

Iterate by pruning long atom pairs until no pair exceeds [x] angstroms refers to an iterative fitting procedure. The sequence alignment is not changed, but residue pairs in the alignment can be removed from the "match list" used to superimpose the structures. In each cycle, atom pairs (CA-CA for amino acids, C4'-C4' for nucleic acids) are removed from the match list and the remaining pairs are fitted, until no matched pair is more than x angstroms apart (x=2.0 by default). The atom pairs removed are either the 10% farthest apart of all pairs or the 50% farthest apart of all pairs exceeding the cutoff, whichever is the lesser number of pairs. The result is that the best-matching "core" regions are maximally superimposed; conformationally dissimilar regions such as flexible loops are not included in the final fit, even though they may be aligned in the sequence alignment.
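The pruning rule for a single cycle can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and the rounding (fractions rounded up, so that at least one pair is removed while any pair exceeds the cutoff) is an assumption not stated above. The least-squares refit between cycles is omitted; in the real procedure the distances are recomputed after each refit.

```python
def pairs_to_prune(distances, cutoff=2.0):
    """Return the indices of the atom pairs to drop in one pruning cycle.

    distances: current distance (angstroms) of each pair in the match list.
    An empty result means no pair exceeds the cutoff, so iteration stops.
    """
    over = [i for i, d in enumerate(distances) if d > cutoff]
    if not over:
        return []

    # Integer ceiling division avoids floating-point rounding surprises.
    ten_percent_of_all = (len(distances) + 9) // 10
    half_of_over_cutoff = (len(over) + 1) // 2
    n_drop = min(ten_percent_of_all, half_of_over_cutoff)

    # Remove the farthest-apart pairs first.
    over.sort(key=lambda i: distances[i], reverse=True)
    return over[:n_drop]


# 20 matched pairs, 4 of them beyond the 2.0 angstrom cutoff.
example = [0.5] * 16 + [2.5, 3.0, 4.0, 5.0]
print(pairs_to_prune(example))
```

Here 10% of all pairs and 50% of the pairs over the cutoff both come to two pairs, so the two farthest-apart pairs are dropped before the next fit.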

OK performs the aligning/matching and dismisses the dialog, whereas Apply performs the operations without dismissing the dialog. Cancel simply dismisses the dialog. Help opens this manual page in a browser window. Sequence alignment scores, parameter values, and final match RMSDs are reported in the Reply Log.

ALIGNMENT SCORING

Sequence alignment scores can include a residue identity/similarity term, a secondary structure term, and gap penalties.

Residue identity/similarity. This term is based on any of several common substitution matrices.

Secondary structure correspondence. This term contributes to the score when Include secondary structure score (N%) is turned on. Clicking Show parameters reveals the secondary structure scoring parameters. N is the percentage weight given to the secondary structure term, with the remaining (100 - N)% given to the residue similarity term; it can be adjusted by moving the slider. If N is 30, for example,
total score = 0.70(residue similarity score) + 0.30(secondary structure score) - gap penalties
The values in the secondary structure Scoring matrix (for all pairwise combinations of H helix, S strand, and O other) and the secondary-structure-specific Gap opening penalties can be adjusted. Reset secondary structure scoring parameters to defaults can be used to restore the default values of all secondary structure scoring parameters.
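As a worked example of the weighting formula above, the sketch below combines the three terms for a given N. The function name and the input numbers are made up for illustration; only the formula itself comes from the text.

```python
def total_alignment_score(similarity, secondary_structure, gap_penalties, ss_percent):
    """Blend the two score terms by the slider percentage N, then subtract gaps."""
    if not 0 <= ss_percent <= 100:
        raise ValueError("ss_percent must be between 0 and 100")
    weight = ss_percent / 100.0
    return (1.0 - weight) * similarity + weight * secondary_structure - gap_penalties


# Hypothetical term values, with N = 30 as in the example above.
score = total_alignment_score(
    similarity=100.0, secondary_structure=50.0, gap_penalties=12.0, ss_percent=30
)
print(round(score, 6))
```

Setting ss_percent to 0 reduces the result to the residue similarity score minus the gap penalties.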

Gap penalties. When a secondary structure term is included, the secondary-structure-specific Gap opening penalties (Intra-helix, Intra-strand, Any other) are used instead of the single Gap opening penalty. The Gap extension penalty is the same in both cases, however. The default gap parameters are not necessarily optimal, so it is reasonable to adjust them as needed.
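The choice between the single opening penalty and the secondary-structure-specific ones can be sketched as below. All numeric values are illustrative placeholders rather than values quoted from this page, the names are hypothetical, and the convention that the first gap position pays the opening penalty while each further position pays the extension penalty is an assumption.

```python
# Illustrative placeholder values (assumptions, not quoted from this page).
GENERAL_GAP_OPEN = 12.0
SS_GAP_OPEN = {"H": 18.0, "S": 18.0, "O": 6.0}  # intra-helix, intra-strand, any other
GAP_EXTEND = 1.0  # shared by both modes


def gap_cost(length, ss_state="O", use_ss=True):
    """Penalty for one gap of `length` positions opened in an H, S, or O region."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    if ss_state not in SS_GAP_OPEN:
        raise ValueError("ss_state must be 'H', 'S', or 'O'")
    # With secondary structure scoring on, the opening penalty depends on
    # where the gap opens; otherwise the single general penalty applies.
    opening = SS_GAP_OPEN[ss_state] if use_ss else GENERAL_GAP_OPEN
    return opening + GAP_EXTEND * (length - 1)


print(gap_cost(3, "H"))                # gap inside a helix, SS scoring on
print(gap_cost(3, "H", use_ss=False))  # same gap, SS scoring off
```

With placeholder values like these, gaps inside helices and strands cost more than gaps elsewhere, which discourages the alignment from breaking up secondary structure elements.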

UCSF Computer Graphics Laboratory / July 2006