Atom Specification

Atoms may be specified in the Command Line by name, property, or proximity. In addition, other forms such as the name of a saved selection or of a surface category are valid atom specifications.

If a saved selection has the same name as a surface category, the saved selection is used. Those familiar with atom specification in Midas may wish to consult the summary of differences.

Multiple models (structures from multiple input files) may be displayed simultaneously. Model numbers are assigned by the user or sequentially by default with the open command. Each model consists of one or more residues, each with a sequence number. The atoms that make up a residue have names which are unique within that residue. Thus, each atom may be described by its model number, residue number, and atom name.

The residue and atom names are determined when the input file is read in, and generally match the standard Protein Data Bank (PDB) residue and atom names. If * occurs within an atom or residue name read from a PDB file, it is changed to ' (prime symbol). Any HETATM residue that does not already have a chain identifier is assigned to chain HET, unless the residue is named WAT or HOH, in which case it is assigned to chain WATER. The residue numbers are unchanged.
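
The renaming and chain-assignment rules above can be sketched in Python. This is only an illustration of the stated rules, not Chimera's file reader: the function names normalize_name and assign_chain are hypothetical, and using a blank string to stand for "no chain identifier" is an assumption of the example.

```python
def normalize_name(name):
    """Replace every * in a PDB atom or residue name with ' (prime)."""
    return name.replace("*", "'")


def assign_chain(record_type, residue_name, chain_id):
    """Return the chain a residue ends up in under the rules described above.

    A blank chain_id stands for "no chain identifier" (an assumption of
    this sketch).
    """
    if chain_id.strip():
        return chain_id  # an existing chain identifier is kept
    if record_type == "HETATM":
        if residue_name.upper() in ("WAT", "HOH"):
            return "WATER"
        return "HET"
    # ATOM records are not covered by the rule; left unchanged in this sketch
    return chain_id


print(normalize_name("O5*"))               # O5'
print(assign_chain("HETATM", "HOH", " "))  # WATER
print(assign_chain("HETATM", "FMN", ""))   # HET
print(assign_chain("HETATM", "FMN", "A"))  # A
```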

BASIC ATOM SPECIFICATION
Symbol   Reference Level   Definition
#        model             number assigned to the model by default or by the user with the open command
:        residue           residue name OR residue sequence number, with any insertion code appended
::       residue           residue name
@        atom              atom name

Note that the lack of any specifier is interpreted to mean all units of the associated reference level. Multiple model numbers or residue numbers may be indicated by comma-separated lists and/or one or more ranges of the form start-end. The literal words start and end may be used in place of the first and last numbers of a range, respectively, and the word all indicates the entire range. Capitalization of residue and atom names, chain identifiers, insertion codes, or alternate location identifiers (more on these below) is not important. Model and residue numbers are integers, although a residue number may have an insertion code directly appended.
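
As a rough illustration of how comma-separated lists and ranges combine, the following Python sketch expands such a list into explicit numbers. It is not Chimera's parser: the function name expand_numbers is hypothetical, the caller is assumed to supply the lowest and highest numbers present, and insertion codes and negative numbers are not handled.

```python
def expand_numbers(spec, first, last):
    """Expand a list such as '45-83,90-98' into a sorted list of integers.

    first and last are the lowest and highest numbers actually present;
    they give meaning to the words 'start', 'end', and 'all'.
    """
    def value(token):
        token = token.strip().lower()
        if token == "start":
            return first
        if token == "end":
            return last
        return int(token)

    numbers = set()
    for item in spec.split(","):
        item = item.strip().lower()
        if item == "all":
            numbers.update(range(first, last + 1))
        elif "-" in item:
            low, high = item.split("-", 1)
            numbers.update(range(value(low), value(high) + 1))
        else:
            numbers.add(value(item))
    return sorted(numbers)


print(expand_numbers("12-15,18", 1, 100))   # [12, 13, 14, 15, 18]
print(expand_numbers("98-end", 1, 100))     # [98, 99, 100]
print(expand_numbers("start-3,7", 1, 100))  # [1, 2, 3, 7]
```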

In some instances, the basic symbols do not uniquely specify the model, residue, or atom of interest. If a PDB input file contains multiple MODELs (often the result of NMR structure determination), they are all read into a single "model" in Chimera and sequentially assigned submodel numbers starting with 1 (the numbers in the file are not used). Similarly, within a model, different chains may contain the same residue number. Finally, within a residue, certain atoms may appear more than once in alternate locations. These subcategorizations are appended to the basic specification. The symbol for the relevant category (#, :, or @) must precede the subcategory specification, although they need not be immediately adjacent.

SUBCATEGORIES
Reference Level   General Form           Definition
model             model(s).submodel(s)   when a single PDB file contains multiple MODELs, they are considered submodels 1, 2, ... of a single model in Chimera
residue           residue(s).chain(s)    when a single model contains multiple chains, a unique specification includes both residue number and chain identifier
atom              atom(s).altloc(s)      when a single residue contains alternate locations of certain atoms, a unique specification includes both atom name and alternate location identifier

Submodel numbers may be indicated by a single value or a range of the form start-end. The literal words start and end may be used in place of the first and last numbers of the range, respectively, and the word all indicates the entire range. Submodel numbers are integers, whereas chain identifiers and alternate location identifiers are alphabetical characters. Because commas are used only to separate values of the basic reference levels (model, residue, and atom), they cannot be used to separate values of the sublevels directly. For example,

#0.1-3,5
is interpreted as submodels 1-3 of model 0 and all of model 5, while
#0.1-3,.5
indicates submodels 1-3 of model 0 and submodel 5 of all models. A subcategory specification applies only to the preceding category value(s) not separated from it by any commas (and remember that the absence of a value means "all"); thus
:12-15,16-18.a,15.b@ca
specifies CA atoms within residues 12-15 (with no chain ID), residues 16 to 18 in chain A, and residue 15 within chain B, and
:50.B,.D
means residue 50 in chain B and all residues in chain D.
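
The scoping rule for subcategories (a chain applies only to the value it is attached to, and a missing value means "all") can be sketched in Python as follows. The function name split_residue_spec is hypothetical and the code handles only the residue part of a specification; it is an illustration of the rule, not Chimera's parser.

```python
def split_residue_spec(spec):
    """Split the residue part of a specification into (residues, chain) pairs.

    An empty residues string means "all residues"; a chain of None means
    that no chain was given for that item.
    """
    pairs = []
    for item in spec.lstrip(":").split(","):
        residues, dot, chain = item.partition(".")
        pairs.append((residues, chain.upper() if dot else None))
    return pairs


print(split_residue_spec(":12-15,16-18.a,15.b"))
# [('12-15', None), ('16-18', 'A'), ('15', 'B')]
print(split_residue_spec(":50.B,.D"))
# [('50', 'B'), ('', 'D')]
```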

Additional examples:

#0
- all atoms in model 0
:.A@ca,c,n,o
- peptide backbone atoms in chain A
#1:50.HET
- HETATM residue 50 in model 1
:522.water
- water residue 522 (a HETATM residue named HOH or WAT)
#0:12.A@CA
- alpha carbon of residue 12 in chain A, model 0
:lys
- or -
::lys
- all lysine residues
#3:45-83,90-98
- residues 45 through 83 and 90 through 98 in model 3
#0:12@CA@N
- alpha carbon of residue 12 in model 0 and nitrogen of residue 12 in model 0
#0:12@CA,N
- alpha carbon and nitrogen of residue 12 in model 0

Notice in the two preceding examples that the atoms CA and N may be delimited by either a comma or the @ symbol. In either case, the preceding (most recent) model and residue information applies to the named atoms. Using the @ notation for both atoms specifies their order. Using commas specifies a "group" of atoms, in no particular order. Thus, in specifications where the order of the atoms is significant (e.g., the match command), the @ notation should be used. For models and residues, the same conventions are followed, with @ replaced by # and :, respectively. For example, for atoms on different residues but the same model:

#1:12,14@CA
- alpha carbons in residue 12 and residue 14
#1:12:14@CA
- all atoms in residue 12 and alpha carbon in residue 14

In the first of the two examples above, residues 12 and 14 together make up a single residue specification, so the alpha carbon atoms in both residues are selected. In the second example, all of residue 12 and only the alpha carbon in residue 14 are selected.

#1:12-20@CA:14@N
- alpha carbons in residues 12 through 20 and nitrogen in residue 14

Wild Cards

The global wild card * matches all atoms in a residue or all residues in a model (same as using the word all). It stands alone as a symbol, that is, it cannot be used to match parts of names, such as G* or *A. The partial wild card = matches parts of atom or residue names but not parts of residue sequence numbers; similarly, the single-character wild card ? matches single characters within residue or atom names but not single digits within residue sequence numbers. For example:

#1:12@*
- or -
#1:12
- all atoms in residue 12 of model 1
#0,1,2:50-*@CA
- all alpha carbon atoms in residues 50 to the end of models 0, 1 and 2
#2:G??
- all residues which have three-character names beginning with G in model 2
:fmn@?1
- atoms within residue FMN which have two-character names ending with 1
@S=
- all atoms which have names beginning with S; in general, this will be all sulfur atoms
#0:*.*@H@H?@H??
- or -
#0@H@H?@H??
- all atoms with one-, two-, or three-character names beginning with H in model 0
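
The behavior of the = and ? wild cards on names can be approximated with regular expressions. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes that = may also match zero characters, which the text above does not state.

```python
import re


def wildcard_to_regex(pattern):
    """Translate = (any run of characters) and ? (one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "=":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # Names are matched without regard to capitalization
    return re.compile("".join(parts), re.IGNORECASE)


def name_matches(pattern, name):
    """True if the whole name matches the pattern, ignoring case."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None


print(name_matches("G??", "GLY"))  # True
print(name_matches("?1", "C11"))   # False (three characters, not two)
print(name_matches("S=", "SG"))    # True
```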

Property Descriptors

Atoms, residues, and models have descriptors or properties that are specified with the slash mark /.

The symbol for the relevant category (@ for atom descriptors, : for residue descriptors, or # for model descriptors) must precede the slash mark, although they need not be immediately adjacent.

Multiple descriptors at the same reference level (different atom properties, for example) can follow a single slash mark and should be separated by and or or. When and and or both occur in the same list, and has higher priority (and-separated lists can be considered grouped within parentheses).

The descriptor names are case-sensitive; the descriptor values, if any, are case-sensitive if specified with ==, but not if specified with = (as shown in the table below). Descriptor values containing spaces (some color names, for example) must be enclosed by double quotes. Descriptors with numerical values can also be used with > (greater than), < (less than), >= (greater than or equal to), and <= (less than or equal to).

The exclamation mark ! indicates that the atoms, residues, or models must not have a property. For yes/no properties the syntax is !descriptor_name, and for multivalued properties the syntax is descriptor_name!=value.

The operators ~ and !~ can be used instead of = and !=, respectively, to indicate that the subsequent string should be treated as a regular expression.

ATOM DESCRIPTORS
Name and Usage Explanation
/altLoc=altloc altloc is the alternate location identifier of the atom
/bfactor=bfactor bfactor is the B-factor value of the atom
/color=color color is the color of the atom (assigned on a per-atom basis; see coloring hierarchy)
/drawMode=mode mode can be 0 (synonyms: dot, wire, wireframe), 1 (sphere, cpk, space-filling), 2 (endcap, stick), or 3 (ball, ball and stick, ball-and-stick, ball+stick, bs, b+s); see draw mode
/defaultRadius=rad rad is the default VDW radius of the atom in angstroms
/display whether the atom is displayed (see display hierarchy)
/element=atno atno is the atomic number
/idatmType=type type is the atom type
/label whether the atom is labeled
/label=label label is the text of the atom's label
/labelColor=labcolor labcolor is the color of the label
/name=name name is simply the atom name
/occupancy=occupancy occupancy is the occupancy value of the atom
/radius=radius radius is the radius of the atom in angstroms (may have been changed by the user from the default VDW radius)
/serialNumber=n n is the atom serial number in the input file
/surfaceCategory=category category is the name of the category to which the atom has been assigned automatically or manually using msms cat (or surfcat)
/surfaceColor=surfcolor surfcolor is the color of the atom's surface
/surfaceDisplay whether the atom's molecular surface is displayed (see display hierarchy)
/vdw whether the atom's VDW surface is displayed

Some examples:

@ca/!label and color!=green and color!=red
- or -
@/name=ca and !label and color!=green and color!=red
- all atoms named CA which are not labeled, and are not green or red.
@n/drawMode=1 and color=green
- green atoms named N drawn as spheres
@n/drawMode=1 or color=green
- atoms named N that are green, drawn as spheres, or both (N atoms that are green and N atoms of any color that are drawn as spheres)
@/color=yellow or color=blue and label
- atoms that are yellow and atoms that are both blue and labeled
@/color!=yellow or color!=blue
- all atoms, because if an atom is yellow it fulfills the criterion of not being blue, and vice versa
@/bfactor>=50
- atoms with B-factor values greater than or equal to 50
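
The precedence of and over or in the examples above can be checked with a small Python sketch, since Python's own and likewise binds more tightly than or. The atom records here are made-up data for illustration, not anything read from Chimera.

```python
# Hypothetical atoms with a color and a yes/no label property
atoms = [
    {"name": "CA", "color": "yellow", "label": False},
    {"name": "CB", "color": "blue", "label": True},
    {"name": "N", "color": "blue", "label": False},
    {"name": "O", "color": "red", "label": True},
]

# @/color=yellow or color=blue and label
# Reads as: yellow or (blue and labeled)
picked = [a["name"] for a in atoms
          if a["color"] == "yellow" or a["color"] == "blue" and a["label"]]
print(picked)  # ['CA', 'CB']

# @/color!=yellow or color!=blue
# Matches everything: no single color can fail both tests at once
everything = [a["name"] for a in atoms
              if a["color"] != "yellow" or a["color"] != "blue"]
print(everything)  # ['CA', 'CB', 'N', 'O']
```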

RESIDUE DESCRIPTORS
Name and Usage Explanation
/color=color color is the color assigned on a per-residue basis (see coloring hierarchy)
/isHelix whether a residue is in an alpha helix (true only possible for amino acids)
/isSheet OR /isStrand whether a residue is in a beta strand (true only possible for amino acids)
/isTurn whether a residue is in a turn according to PDB TURN records (true only possible for amino acids)
/type=resname resname is the residue name

Amino acid residue secondary structure properties are determined from HELIX and SHEET records in the input file, or if these are not present, using ksdssp.

Examples:

:/type!=gly and type!=pro
- all residues not named GLY or PRO
:/isStrand :/isHelix
- or -
:/isStrand or isHelix
- all amino acid residues in beta strands or alpha helices
:/isStrand and isHelix
- nothing, because the criteria are mutually exclusive

MODEL DESCRIPTORS
Name and Usage Explanation
/color=color color is the color assigned on a per-model basis (see coloring hierarchy)
/display whether display is enabled at the model level (see display hierarchy)
/explicitHydrogens whether the model has hydrogen atoms
/lineWidth=width width is the linewidth of bonds in the model in the wireframe representation
/pointSize=size size is the font size of labels on the model
/vdwDensity=density density is the dot density used for VDW surfaces on the model

Zones

Zone specifiers are used to select atoms and residues that are within a given distance of the referenced atom(s). z< and zr< specify all residues with any atom within the given distance from the referenced atoms. za< specifies all atoms within the given distance. z>, zr>, and za> yield the sets complementary to their < counterparts. For example,

#1:gtp za<10.5
selects all atoms within 10.5 angstroms of any atom in residue GTP, model 1.
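
The difference between atom-based zones (za<) and residue-based zones (z< or zr<) can be sketched in Python. The coordinates and residue identifiers below are invented for the example, the function names are hypothetical, and the sketch assumes that the referenced atoms themselves fall inside the zone.

```python
from math import dist  # Python 3.8+

# Hypothetical atoms: (residue identifier, atom name, (x, y, z) in angstroms)
atoms = [
    ("GTP 1", "PG", (0.0, 0.0, 0.0)),
    ("LYS 16", "NZ", (3.0, 0.0, 0.0)),
    ("LYS 16", "CA", (9.0, 0.0, 0.0)),
    ("ASP 57", "CG", (20.0, 0.0, 0.0)),
]
reference = [a for a in atoms if a[0] == "GTP 1"]


def atoms_within(cutoff):
    """Like za<cutoff: atoms closer than cutoff to any reference atom."""
    return [a for a in atoms
            if any(dist(a[2], r[2]) < cutoff for r in reference)]


def residues_within(cutoff):
    """Like z<cutoff or zr<cutoff: whole residues with any atom in range."""
    hit = {a[0] for a in atoms_within(cutoff)}
    return [a for a in atoms if a[0] in hit]


print([a[:2] for a in atoms_within(5.0)])
# [('GTP 1', 'PG'), ('LYS 16', 'NZ')]
print([a[:2] for a in residues_within(5.0)])
# [('GTP 1', 'PG'), ('LYS 16', 'NZ'), ('LYS 16', 'CA')]
```

In this picture, the > forms correspond to the atoms left over after removing the matching < set from the full atom list.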

For more complicated atom specifications, intersections can be handled with the & operator. For example, one may want negatively charged amino acids in model 1 which are within 10 angstroms of model 0:

#1:asp,glu & #0 z<10
- or -
#1:/type=asp or type=glu & #0 z<10