molarray: Collections of molecular objects

The classes Atoms, Bonds, Residues… provide access to collections of C++ molecular data objects. These are typically obtained from an AtomicStructure object, which is produced by reading a PDB file.

Data items in a collection are ordered and the same object may be repeated.

Collections have attributes such as Atoms.coords that return a numpy array of values, one for each object in the collection. This offers better performance than using a Python list of atoms since it directly accesses the C++ atomic data. When using a list of atoms, a Python Atom object is created for each atom, which requires much more memory and is slower to use in computation. Working with lists is still often desirable when computations are not easily done using arrays of attributes. To get a list of atoms use list(x) where x is an Atoms collection. Collections support Python iteration, so constructs such as a “for” loop over an Atoms collection are valid: “for a in atoms: …”.
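
As a rough illustration of the array-based style described above, the sketch below computes a centroid from a coords attribute in a single numpy call. The helper name centroid and the stand-in object used to run it are hypothetical; in ChimeraX the argument would be an Atoms collection.

```python
import numpy as np
from types import SimpleNamespace

def centroid(atoms):
    """Mean position of a collection that exposes an Nx3 ``coords`` array."""
    xyz = np.asarray(atoms.coords, dtype=np.float64)  # one array, no per-atom Python objects
    return xyz.mean(axis=0)

# Hypothetical stand-in for an Atoms collection, used only to exercise the sketch.
fake_atoms = SimpleNamespace(coords=np.array([[0.0, 0.0, 0.0],
                                              [2.0, 0.0, 0.0],
                                              [1.0, 3.0, 0.0]]))
center = centroid(fake_atoms)  # array([1., 1., 0.])
```

The same function works on any object with a coords attribute, which keeps the computation independent of how the atoms were obtained.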

There are collections Atoms, Bonds, Pseudobonds, Residues, Chains, AtomicStructures.

Some attributes return collections instead of numpy arrays. For example, atoms.residues returns a Residues collection that has one residue for each atom in the collection atoms. If only a collection of unique residues is desired, use atoms.unique_residues.

Collections have base class Collection which provides many standard methods such as length, iteration, indexing with square brackets, index of an element, intersections, unions, subtraction, filtering….

Collections are mostly immutable. The only case in which their contents can be altered is if C++ objects they hold are deleted, in which case those objects are automatically removed from the collection. Because their contents can change in this way, they cannot be used as dictionary keys or added to sets.

class Collection(items, object_class, objects_class)

Bases: chimerax.core.state.State

Base class of all molecular data collections that provides common methods such as length, iteration, indexing with square brackets, intersection, union, subtracting, and filtering. By design, a Collection is immutable.

hash()

Can be used for quickly determining if collections have the same elements in the same order. Objects are automatically deleted from the collection when the C++ object is deleted. So this hash value will not be valid if the collection changes. This is not the __hash__ special Python method and it is not supported to use collections as keys of dictionaries or elements of sets since they are mutable.

__len__()

Number of objects in collection.

__iter__()

Iterator over collection objects.

__getitem__(i)

Indexing of collection objects using square brackets, e.g. c[i].

index(object)

Find the position of the first occurrence of an object in a collection.

indices(objects)

Return int32 array indicating for each element in objects the index of its first occurrence in the collection, or -1 if it does not occur in the collection.
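
The first-occurrence-or-minus-one behavior described for indices() can be sketched in plain numpy. This is not the ChimeraX implementation: first_indices is a hypothetical helper, and the integer arrays stand in for the C++ object pointers a collection holds.

```python
import numpy as np

def first_indices(collection_ptrs, query_ptrs):
    """For each query value, index of its first occurrence in collection_ptrs, or -1."""
    first = {}
    for i, p in enumerate(collection_ptrs.tolist()):
        first.setdefault(p, i)  # keep only the first position seen for each value
    return np.array([first.get(q, -1) for q in query_ptrs.tolist()], dtype=np.int32)

collection = np.array([30, 10, 20, 10], dtype=np.uintp)  # 10 appears twice
queries = np.array([10, 99, 30], dtype=np.uintp)         # 99 is absent
result = first_indices(collection, queries)              # array([ 1, -1,  0], dtype=int32)
```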

__or__(objects)

The or operator | takes the union of two collections removing duplicates.

__and__(objects)

The and operator & takes the intersection of two collections removing duplicates.

__sub__(objects)

The subtract operator “-” subtracts one collection from another as sets, eliminating all duplicates.
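
The set semantics of the three operators can be illustrated with numpy's set routines on plain integer arrays that stand in for object pointers. This is only an analogy: numpy returns sorted results, and no claim is made here about the order ChimeraX produces.

```python
import numpy as np

a = np.array([3, 1, 1, 2], dtype=np.uintp)  # stand-in for one collection (1 is repeated)
b = np.array([2, 2, 4], dtype=np.uintp)     # stand-in for another (2 is repeated)

union = np.union1d(a, b)         # analogous to a | b, duplicates removed
common = np.intersect1d(a, b)    # analogous to a & b, duplicates removed
difference = np.setdiff1d(a, b)  # analogous to a - b, duplicates removed
```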

copy()

Shallow copy, since Collections are immutable.

intersect(objects)

Return a new collection that is the intersection with the objects Collection.

intersects(objects)

Whether this collection has any element in common with the objects Collection. Returns bool.

intersects_each(objects_list)

Check if each of several pointer arrays intersects this array. Return a boolean array of length equal to the length of objects_list.

filter(mask_or_indices)

Return a subset of the collection as a new collection.

Parameters:

mask_or_indices : numpy bool array (mask) or int array (indices)

Bool length must match the length of the collection and filters out items where the bool array is False.
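
The two accepted argument forms mirror ordinary numpy indexing. The sketch below uses a plain integer array as a stand-in for collection contents; it assumes, without confirmation from the source, that an index array selects items in the order given.

```python
import numpy as np

items = np.array([101, 102, 103, 104])       # stand-in for collection contents

mask = np.array([True, False, True, False])  # bool array, same length as items
by_mask = items[mask]                        # keeps positions where mask is True

indices = np.array([3, 0])                   # int array of positions to keep
by_index = items[indices]
```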

mask(objects)

Return bool array indicating for each object in current set whether that object appears in the argument objects.
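
A membership mask of this kind can be sketched with numpy.isin, again using integer arrays as stand-ins for object pointers. The result has one entry per element of the current collection, including repeated elements.

```python
import numpy as np

current = np.array([5, 7, 5, 9], dtype=np.uintp)  # stand-in for this collection
other = np.array([9, 5], dtype=np.uintp)          # stand-in for the argument objects

in_other = np.isin(current, other)  # array([ True, False,  True,  True])
```

Such a mask can then be passed to filter() to obtain the matching subset.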

merge(objects)

Return a new collection combining this one with the objects Collection. All duplicates are removed.

subtract(objects)

Return a new collection subtracting the objects Collection from this one. All duplicates are removed. Currently does not preserve order.

unique()

Return a new collection containing the unique elements from this one, preserving order.

instances(instantiate=True)

Returns a list of the Python instances. If ‘instantiate’ is False, then for those items that haven’t yet been instantiated, None will be returned.

concatenate(collections, object_class=None, remove_duplicates=False)

Concatenate any number of collections returning a new collection. All collections must have the same type.

Parameters:

collections : sequence of Collection objects

unique_ordered(a)

Return unique elements of numpy array a preserving order.
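
One common way to get order-preserving unique elements in numpy is to take the first-occurrence indices from numpy.unique and sort them. The sketch below shows that technique; it is not necessarily how unique_ordered is implemented, and the function name is chosen here to avoid implying otherwise.

```python
import numpy as np

def unique_preserving_order(a):
    """Unique elements of a 1-D array, in order of first appearance."""
    a = np.asarray(a)
    _, first_pos = np.unique(a, return_index=True)  # index of first occurrence of each value
    return a[np.sort(first_pos)]                    # restore original ordering

result = unique_preserving_order([4, 2, 4, 7, 2])   # array([4, 2, 7])
```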

class Atoms(c_pointers=None, guaranteed_live_pointers=False)

Bases: chimerax.core.atomic.molarray.Collection

An ordered collection of atom objects. This offers better performance than using a list of atoms. It provides methods to access atom attributes such as coordinates as numpy arrays. Atoms directly accesses the C++ atomic data without creating Python Atom objects which require much more memory and are slower to use in computation.

by_chain

Return list of triples of structure, chain id, and Atoms for each chain.

by_structure

Return list of 2-tuples of (structure, Atoms for that structure).

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

coords

Returns a numpy Nx3 array of XYZ values. Can be set.

coord_indices

Coordinate index of atom in coordinate set.

displays

Controls whether the Atoms should be displayed. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

draw_modes

Controls how the Atoms should be depicted. The values are integers, SPHERE_STYLE, BALL_STYLE or STICK_STYLE as documented in the Atom class. Returns a numpy array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.

elements

Returns an Elements collection whose data items correspond in a 1-to-1 fashion with the items in the Atoms. Read only.

element_names

Returns a numpy array of chemical element names. Read only.

element_numbers

Returns a numpy array of atomic numbers (integers). Read only.

names

Returns a numpy array of atom names. Can be set with such an array (or equivalent sequence), or with a single string. Atom names are limited to 4 characters.

hides

Whether atom is hidden (overrides display). Returns a numpy array of int32 bitmask.

Possible values:

HIDE_RIBBON
Hide mask for backbone atoms in ribbon.

Can be set with such an array (or equivalent sequence), or with a single integer value.

idatm_types

Returns a numpy array of IDATM types. Can be set with such an array (or equivalent sequence), or with a single string.

in_chains

Whether each atom belongs to a polymer. Returns numpy bool array. Read only.

is_riboses

Whether each atom is part of a nucleic acid ribose moiety. Returns numpy bool array. Read only.

is_side_chains

Whether each atom is part of an amino/nucleic acid sidechain. Includes atoms needed to connect to backbone (CA/ribose). Returns numpy bool array. Read only.

is_side_connectors

Whether each atom is needed to connect to backbone (CA/ribose). Returns numpy bool array. Read only.

is_side_onlys

Whether each atom is part of an amino/nucleic acid sidechain. Does not include atoms needed to connect to backbone (CA/ribose). Returns numpy bool array. Read only.

intra_bonds

Bonds object where both endpoint atoms are in this collection

radii

Returns a numpy array of radii. Can be set with such an array (or equivalent sequence), or with a single floating-point number.

default_radii

Returns a numpy array of default radii.

maximum_bond_radii(default_radius=0.2)

Return maximum bond radius for each atom. Used for stick style atom display.

residues

Returns a Residues whose data items correspond in a 1-to-1 fashion with the items in the Atoms. Read only.

scene_bounds

Return scene bounds of atoms including instances of all parent models.

scene_coords

Atoms’ coordinates in the global scene coordinate system. This accounts for the Drawing positions for the hierarchy of models each atom belongs to.

selected

numpy bool array whether each Atom is selected.

selecteds

numpy bool array whether each Atom is selected.

num_selected

Number of selected atoms.

has_selected_bonds

For each atom, whether any connected bond is selected.

serial_numbers

Serial numbers of atoms

shown_atoms

Subset of Atoms including atoms that are displayed or have ribbon displayed and have displayed structure and displayed parent models.

structure_categories

Numpy array of whether atom is ligand, ion, etc.

structures

Returns an AtomicStructure for each atom. Read only.

unique_residues

The unique Residues for these atoms.

unique_chain_ids

The unique chain IDs as a numpy array of strings.

unique_structures

The unique structures as an AtomicStructures collection

full_residues

The Residues all of whose atoms are in this Atoms instance

full_structures

The Structures all of whose atoms are in this Atoms instance

single_structure

Whether all atoms belong to a single Structure.

visibles

Returns whether the Atom should be visible (displayed and not hidden). Returns a numpy array of boolean values. Read only.
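
The "displayed and not hidden" rule can be written as a one-line array expression. The sketch below uses small made-up arrays in place of the displays and hides attributes, and assumes that any nonzero hide bitmask counts as hidden.

```python
import numpy as np

displays = np.array([True, True, False])     # stand-in for atoms.displays
hides = np.array([0, 1, 0], dtype=np.int32)  # stand-in for atoms.hides (bitmask per atom)

visibles = displays & (hides == 0)           # displayed and no hide bits set
```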

alt_locs

Returns current alternate location indicators

delete()

Delete the C++ Atom objects

update_ribbon_visibility()

Update the ‘hide’ status for ribbon control point atoms, which are hidden unless any of their neighbors are visible.

has_aniso_u

Boolean array identifying which atoms have anisotropic temperature factors.

aniso_u

Anisotropic temperature factors, returns Nx3x3 array of numpy float32 or None if any of the atoms does not have temperature factors. Read only.

aniso_u6

Get anisotropic temperature factors as a Nx6 array of numpy float32 containing (u11,u22,u33,u12,u13,u23) for each atom or None if any of the atoms does not have temperature factors.
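
The six-value layout maps onto a symmetric 3x3 matrix per atom. The following sketch expands an Nx6 array in the order (u11,u22,u33,u12,u13,u23) into Nx3x3 form; u6_to_u33 is a hypothetical helper, not a ChimeraX function.

```python
import numpy as np

def u6_to_u33(u6):
    """Expand Nx6 (u11,u22,u33,u12,u13,u23) into N symmetric 3x3 matrices."""
    u6 = np.asarray(u6, dtype=np.float32).reshape(-1, 6)
    u11, u22, u33, u12, u13, u23 = u6.T
    u = np.empty((len(u6), 3, 3), dtype=np.float32)
    u[:, 0, 0], u[:, 1, 1], u[:, 2, 2] = u11, u22, u33  # diagonal terms
    u[:, 0, 1] = u[:, 1, 0] = u12                       # symmetric off-diagonal terms
    u[:, 0, 2] = u[:, 2, 0] = u13
    u[:, 1, 2] = u[:, 2, 1] = u23
    return u

u = u6_to_u33([[1, 2, 3, 4, 5, 6]])
```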

residue_sums(atom_values)

Compute per-residue sum of atom float values. Return unique residues and array of residue sums.
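
A grouped sum of this kind can be sketched with numpy.unique and numpy.bincount. Here integer group ids stand in for the residue of each atom, per_group_sums is a hypothetical helper, and the groups come back in sorted id order, which may differ from the order ChimeraX returns.

```python
import numpy as np

def per_group_sums(group_ids, values):
    """Sum float values within each group; returns (unique ids, sums)."""
    group_ids = np.asarray(group_ids)
    values = np.asarray(values, dtype=np.float64)
    unique_ids, inverse = np.unique(group_ids, return_inverse=True)
    sums = np.bincount(inverse.ravel(), weights=values, minlength=len(unique_ids))
    return unique_ids, sums

# Five atoms belonging to two residues (ids 7 and 3).
ids, sums = per_group_sums([7, 7, 3, 7, 3], [1.0, 2.0, 10.0, 3.0, 20.0])
```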

class Bonds(bond_pointers=None)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ bonds.

atoms

Returns a two-tuple of Atoms objects. For each bond, its endpoint atoms are in the matching position in the two Atoms collections. Read only.

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

displays

Controls whether the Bonds should be displayed. Returns a numpy array of bool. Can be set with such an array (or equivalent sequence), or with a single value. Bonds are shown only if display is true, hide is false, and both atoms are shown.

visibles

Returns whether the Bonds should be visible regardless of whether the atoms on either end are shown. Returns a numpy array of bool. Read only.

halfbonds

Controls whether the Bonds should be colored in “halfbond” mode, i.e. each half colored the same as its endpoint atom. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

radii

Returns a numpy array of bond radii (half thicknesses). Can be set with such an array (or equivalent sequence), or with a single floating-point number.

selected

numpy bool array whether each Bond is selected.

ends_selected

For each bond, whether both of its endpoint atoms are selected.

showns

Whether each bond is displayed, visible and has both atoms shown, and at least one atom is not Sphere style.

structures

Returns a StructureDatas with the structure for each bond. Read only.

unique_structures

The unique structures as an AtomicStructures collection

by_structure

Return list of 2-tuples of (structure, Bonds for that structure).

delete()

Delete the C++ Bonds objects

num_shown

Number of bonds shown.

num_selected

Number of selected bonds.

half_colors

2N x 4 RGBA uint8 numpy array of half bond colors.

halfbond_cylinder_placements(opengl_array=None)

Return Places for halfbond cylinders specified by 2N 4x4 float matrices.

class Elements(element_pointers)

Bases: chimerax.core.atomic.molarray.Collection

Holds a collection of C++ Elements (chemical elements) and provides access to some of their attributes. Used for the same reasons as the Atoms class.

names

Returns a numpy array of chemical element names. Read only.

numbers

Returns a numpy array of atomic numbers (integers). Read only.

masses

Returns a numpy array of atomic masses, taken from http://en.wikipedia.org/wiki/List_of_elements_by_atomic_weight. Read only.

is_alkali_metal

Returns a numpy array of booleans, where True indicates the element is an alkali metal. Read only.

is_halogen

Returns a numpy array of booleans, where True indicates the element is a halogen. Read only.

is_metal

Returns a numpy array of booleans, where True indicates the element is a metal. Read only.

is_noble_gas

Returns a numpy array of booleans, where True indicates the element is a noble gas. Read only.

valences

Returns a numpy array of atomic valence numbers (integers). Read only.

class Pseudobonds(pbond_pointers=None)

Bases: chimerax.core.atomic.molarray.Collection

Holds a collection of C++ PBonds (pseudobonds) and provides access to some of their attributes. It has the same attributes as the Bonds class and works in an analogous fashion.

atoms

Returns a two-tuple of Atoms objects. For each bond, its endpoint atoms are in the matching position in the two Atoms collections. Read only.

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

displays

Controls whether the pseudobonds should be displayed. Returns a numpy array of bool. Can be set with such an array (or equivalent sequence), or with a single value. Pseudobonds are shown only if display is true, hide is false, and both atoms are shown.

groups

Returns a PseudobondGroups collection of the pseudobond groups these pseudobonds belong to

halfbonds

Controls whether the pseudobonds should be colored in “halfbond” mode, i.e. each half colored the same as its endpoint atom. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

radii

Returns a numpy array of pseudobond radii (half thicknesses). Can be set with such an array (or equivalent sequence), or with a single floating-point number.

selected

numpy bool array whether each Pseudobond is selected.

showns

Whether each pseudobond is displayed, visible and has both atoms displayed.

shown_when_atoms_hiddens

Controls whether the pseudobond is shown when the endpoint atoms are not explicitly displayed (atom.display == False) but are implicitly shown by a ribbon or somesuch (atom.hide != 0). Defaults to True.

delete()

Delete the C++ Pseudobond objects

lengths

Distances between pseudobond end points.
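
Given the two endpoint coordinate arrays, such distances reduce to a row-wise Euclidean norm. The sketch below uses made-up coordinates; which coordinate system ChimeraX uses for this attribute is not stated in the source and is not assumed here.

```python
import numpy as np

end1 = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])  # first endpoint of each pseudobond
end2 = np.array([[3.0, 4.0, 0.0], [1.0, 1.0, 3.0]])  # second endpoint of each pseudobond

lengths = np.linalg.norm(end1 - end2, axis=1)        # one distance per pseudobond
```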

half_colors

2N x 4 RGBA uint8 numpy array of half bond colors.

between_atoms(atoms)

Return mask of those pseudobonds which have both ends in the given set of atoms.
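
The "both ends in the set" test is the logical AND of two membership masks. In this sketch integer ids stand in for atom pointers; it illustrates the logic only and is not the ChimeraX implementation.

```python
import numpy as np

end1 = np.array([1, 2, 3])    # stand-in ids for the first endpoint of each pseudobond
end2 = np.array([2, 5, 1])    # stand-in ids for the second endpoint
atom_set = np.array([1, 2, 3])

both_in = np.isin(end1, atom_set) & np.isin(end2, atom_set)
```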

unique_structures

The unique structures as a StructureDatas collection

by_group

Return list of 2-tuples of (PseudobondGroup, Pseudobonds for that group).

num_selected

Number of selected pseudobonds.

class Residues(residue_pointers=None, residues=None)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ residue objects.

atoms

Return Atoms belonging to each residue all as a single collection. Read only.

centers

Average of atom positions as a numpy length 3 array, 64-bit float values.

chains

Return Chains for residues. Residues with no chain are omitted. Read only.

chain_ids

Returns a numpy array of chain IDs. Read only.

mmcif_chain_ids

Returns a numpy array of chain IDs. Read only.

insertion_codes

Returns a numpy array of insertion codes. An empty string indicates no insertion code.

is_helix

Returns a numpy bool array whether each residue is in a protein helix

is_strand

Returns a numpy bool array whether each residue is in a protein sheet

names

Returns a numpy array of residue names. Read only.

num_atoms

Returns a numpy integer array of the number of atoms in each residue. Read only.

numbers

Returns a numpy array of residue sequence numbers, as provided by whatever data source the structure came from, so not necessarily consecutive, or starting from 1, etc. Read only.

polymer_types

Returns a numpy int array of residue types. Read only.

principal_atoms

List of the ‘chain trace’ Atoms or None (for residues without such an atom).

Normally returns the C4’ from a nucleic acid since that is always present, but in the case of a P-only trace it returns the P.

existing_principal_atoms

Like the principal_atoms property, but returns a Residues collection omitting Nones

ribbon_displays

A numpy bool array whether to display each residue in ribbon style.

ribbon_colors

A numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

ribbon_adjusts

A numpy float array of adjustment factors for the position of ribbon control points. Factors range from zero to one, with zero meaning the actual atomic coordinates are used as the control point, and one meaning the idealized secondary structure position is used as the control point. A negative value means to use the default of zero for turns and helices and 0.7 for strands.
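
One plausible reading of the adjustment factor is a linear blend between the actual and idealized positions. The sketch below makes that assumption explicit; the linear interpolation, the function name control_points, and the is_strand input are all hypothetical and not confirmed by the source.

```python
import numpy as np

def control_points(actual, ideal, adjusts, is_strand):
    """Blend actual and idealized positions, ASSUMING linear interpolation."""
    adjusts = np.asarray(adjusts, dtype=np.float64)
    defaults = np.where(is_strand, 0.7, 0.0)                     # documented defaults
    f = np.where(adjusts < 0, defaults, adjusts)[:, np.newaxis]  # negative means default
    return (1.0 - f) * actual + f * ideal

actual = np.zeros((3, 3))
ideal = np.ones((3, 3))
points = control_points(actual, ideal,
                        adjusts=[-1.0, -1.0, 0.5],
                        is_strand=np.array([True, False, False]))
```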

ribbon_hide_backbones

A numpy array of booleans. Whether a ribbon automatically hides the residue backbone atoms.

secondary_structure_ids

A numpy array of integer secondary structure ids. Every helix, sheet, coil has a unique integer id. The ids are computed on the fly from the collection of residues and are not persistent. Read only.

ss_ids

A numpy array of integer secondary structure IDs, determined by the input file. For a PDB file, for helices, the ID is the same as in the HELIX record; for strands, it starts as 1 for the strand nearest the N terminus, and increments for each strand out to the C terminus.

ss_types

Returns a numpy integer array of secondary structure types (one of: Residue.SS_COIL, Residue.SS_HELIX, Residue.SS_STRAND [or SS_SHEET])

structures

Returns StructureDatas collection containing structures for each residue.

delete()

Delete the C++ Residue objects

unique_structures

The unique structures as a StructureDatas collection

unique_chain_ids

The unique chain IDs as a numpy array of strings.

unique_chains

The unique chains as a Chains collection

by_chain

Return list of triples of structure, chain id, and Residues for each chain.

by_structure

Return list of pairs of structure and Residues for that structure.

unique_ids

A numpy array of uintp (unsigned integral type large enough to hold a pointer). Multiple copies of the same residue in the collection will have the same integer value in the returned array. Read only.

unique_sequences

Return a list of sequence strings and a numpy array giving an integer index for each residue. Index 0 is for residues that are not part of a chain (empty string).

get_polymer_spline()

Return a tuple of spline center and guide coordinates for a polymer chain. Residues in the chain that do not have a center atom will have their display bit turned off. Center coordinates are returned as a numpy array. Guide coordinates are only returned if all spline atoms have matching guide atoms; otherwise, None is returned for guide coordinates.

ribbon_clear_hides()

Clear the hide bit for all atoms in given residues.

ribbon_num_selected

Number of selected residue ribbons.

ribbon_selected

numpy bool array whether each Residue ribbon is selected.

class Rings(ring_pointers=None, rings=None)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ ring objects.

aromatics

A numpy bool array whether corresponding ring is aromatic.

atoms

Return Atoms belonging to each ring all as a single collection. Read only.

bonds

Return Bonds belonging to each ring all as a single collection. Read only.

sizes

Returns a numpy integer array of the size of each ring. Read only.

class Chains(chain_pointers)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ chain objects.

chain_ids

A numpy array of string chain ids for each chain. Read only.

structures

A StructureDatas collection containing structures for each chain.

existing_residues

A Residues containing the existing residues of all chains. Read only.

num_existing_residues

A numpy integer array containing the number of existing residues in each chain.

num_residues

A numpy integer array containing the number of residues in each chain.

polymer_types

Returns a numpy int array of residue types. Same values as Residues.polymer_types except shouldn’t return PT_NONE.

class StructureDatas(mol_pointers)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ atomic structure objects.

alt_loc_change_notifies

Whether notifications are issued when altlocs are changed. Should only be set to true when temporarily changing alt locs in a Python script. Numpy bool array.

atoms

A single Atoms containing atoms for all structures. Read only.

bonds

A single Bonds object containing bonds for all structures. Read only.

chains

A single Chains object containing chains for all structures. Read only.

lower_case_chains

A numpy bool array of lower_case_names of each structure.

num_atoms

Number of atoms in each structure. Read only.

num_bonds

Number of bonds in each structure. Read only.

num_chains

Number of chains in each structure. Read only.

num_residues

Number of residues in each structure. Read only.

residues

A single Residues object containing residues for all structures. Read only.

pbg_maps

Returns a list of dictionaries whose keys are pseudobond group categories (strings) and whose values are Pseudobonds. Read only.

ribbon_tether_scales

Returns an array of scale factors for ribbon tethers.

ribbon_tether_sides

Returns an array of numbers of sides for ribbon tethers.

ribbon_tether_shapes

Returns an array of shapes for ribbon tethers.

metadata

Return a list of dictionaries with metadata. Read only.

ribbon_tether_opacities

Returns an array of opacity scale factors for ribbon tethers.

ribbon_show_spines

Returns an array of booleans of whether to show ribbon spines.

ribbon_orientations

Returns an array of ribbon orientations.

ss_assigneds

Whether secondary structure has been assigned, either from data in the original structure file, or from an algorithm (e.g. dssp command).

class AtomicStructures(mol_pointers)

Bases: chimerax.core.atomic.molarray.StructureDatas

Collection of Python atomic structure objects.

class PseudobondGroupDatas(pbg_pointers)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ pseudobond group objects.

pseudobonds

A single Pseudobonds object containing pseudobonds for all groups. Read only.

names

A numpy string array of categories of each group.

num_bonds

Number of pseudobonds in each group. Read only.

class PseudobondGroups(pbg_pointers)

Bases: chimerax.core.atomic.molarray.PseudobondGroupDatas

Collection of Python pseudobond group objects.

class CoordSets(cs_pointers=None)

Bases: chimerax.core.atomic.molarray.Collection

Collection of C++ coordsets.

ids

ID numbers of coordsets

structures

Returns an AtomicStructure for each coordset. Read only.