molarray: Collections of molecular objects

The classes Atoms, Bonds, Residues, and so on provide access to collections of C++ molecular data objects. One typically gets these from an AtomicStructure object, which is produced by reading a PDB file.

Data items in a collection are ordered, and the same object may be repeated.

Collections have attributes such as Atoms.coords that return a numpy array of values, one for each object in the collection. This offers better performance than using a Python list of atoms since it directly accesses the C++ atomic data. When using a list of atoms, a Python Atom object is created for each atom, which requires much more memory and is slower to use in computation.

Working with lists is still often desirable when computations are not easily done using arrays of attributes. To get a list of atoms use list(x) where x is an Atoms collection. Collections are iterable, so constructs such as a “for” loop over an Atoms collection are valid: “for a in atoms: ...”.
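The difference between array-style and list-style computation can be sketched with plain numpy. This is only an illustration: the coords array below is a hand-made stand-in for the Nx3 array that Atoms.coords is described as returning, and nothing here calls molarray itself.

```python
import numpy as np

# Stand-in for the Nx3 array described for Atoms.coords (made-up values).
coords = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0],
                   [0.0, 4.0, 0.0],
                   [2.0, 4.0, 0.0]])

# Array style: one vectorized operation over all atoms.
centroid = coords.mean(axis=0)

# List style: a Python-level loop over per-atom values, analogous to
# iterating over individual Atom objects.
rows = [tuple(xyz) for xyz in coords]
n = len(rows)
centroid_loop = tuple(sum(r[k] for r in rows) / n for k in range(3))
```

Both approaches give the same centroid; the array form avoids creating one Python object per atom.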

There are collections Atoms, Bonds, Pseudobonds, Residues, Chains, AtomicStructureDatas.

Some attributes return collections instead of numpy arrays. For example, atoms.residues returns a Residues collection that has one residue for each atom in the collection atoms. If only a collection of unique residues is desired, use atoms.unique_residues.
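The distinction between a per-atom result and a de-duplicated result can be sketched as follows. The integer residue ids are hypothetical stand-ins for residue objects, and np.unique returns sorted values, which may not match the ordering the real unique_residues attribute uses.

```python
import numpy as np

# Hypothetical residue id for each of six atoms: one entry per atom,
# so residues shared by several atoms appear repeatedly.
residue_of_atom = np.array([10, 10, 10, 11, 11, 12])

# De-duplicated view, analogous in spirit to atoms.unique_residues.
unique_residues = np.unique(residue_of_atom)
```

Here residue_of_atom has the same length as the atom collection, while unique_residues has one entry per distinct residue.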

Collections have base class Collection which provides many standard methods such as length, iteration, indexing with square brackets, index of an element, intersections, unions, subtraction, filtering....

Collections are immutable, so they can be hashed. The only case in which their contents can be altered is if the C++ objects they hold are deleted, in which case those objects are automatically removed from the collection.

class Collection(pointers, object_class, objects_class)

Base class of all molecular data collections that provides common methods such as length, iteration, indexing with square brackets, intersection, union, subtracting, and filtering.

__len__()

Number of objects in collection.

__iter__()

Iterator over collection objects.

__getitem__(i)

Indexing of collection objects using square brackets, e.g. c[i].

index(object)

Find the position of the first occurrence of an object in a collection.

__or__(objects)

The or operator | takes the union of two collections removing duplicates.

__and__(objects)

The and operator & takes the intersection of two collections removing duplicates.

__sub__(objects)

The subtract operator “-” subtracts one collection from another as sets, eliminating all duplicates.
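The set semantics of the three operators can be illustrated with numpy set routines on integer arrays. The integers are stand-ins for object pointers; the ordering of results from the real operators is not stated in this documentation, so the sorted output of the numpy functions is an artifact of this sketch.

```python
import numpy as np

# Stand-ins for two collections; note the repeated entries.
a = np.array([1, 2, 2, 3, 4])
b = np.array([3, 4, 4, 5])

union = np.union1d(a, b)         # analogous to a | b
common = np.intersect1d(a, b)    # analogous to a & b
difference = np.setdiff1d(a, b)  # analogous to a - b
```

In all three results each element appears at most once, even though both inputs contained repeats.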

intersect(objects)

Return a new collection that is the intersection with the objects Collection.

intersects(objects)

Whether this collection has any element in common with the objects Collection. Returns bool.

intersects_each(objects_list)

Check whether each of several pointer arrays intersects this array. Return a boolean array of length equal to the length of objects_list.
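The shape of the result can be sketched with plain numpy, assuming integer arrays stand in for the pointer arrays. This is an illustration of the described behavior, not the library's implementation.

```python
import numpy as np

# Stand-in for this collection's pointers.
this = np.array([1, 2, 3])

# Stand-ins for the entries of objects_list, including an empty one.
objects_list = [np.array([3, 9]), np.array([7, 8]), np.array([], dtype=int)]

# One boolean per entry: does it share any element with `this`?
hits = np.array([np.isin(o, this).any() for o in objects_list])
```

The result has one entry per element of objects_list; an empty entry cannot intersect anything and so yields False.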

filter(mask)

Return a subset of the collection as a new collection.

Parameters:

mask : numpy bool array

Array length must match the length of the collection.
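A typical way to build such a mask is to compare a per-object attribute array against a value. The sketch below uses a hand-made array in place of an attribute such as element_numbers and uses integer indices in place of the collection itself.

```python
import numpy as np

# Stand-in for a per-atom attribute array (atomic numbers).
element_numbers = np.array([6, 7, 6, 8, 1])

# Boolean mask with one entry per object; its length matches the collection.
mask = element_numbers == 6

# Stand-in for the collection: here just the positions 0..N-1.
indices = np.arange(len(element_numbers))

# Selecting with the mask keeps the objects where the mask is True.
carbons = indices[mask]
```

With a real collection, the equivalent step would be passing mask to filter, which returns the subset as a new collection.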

mask(objects)

Return a bool array indicating, for each object in this collection, whether that object appears in the argument objects.
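The membership test described here resembles numpy's isin. A minimal sketch, with integers standing in for the objects of both collections:

```python
import numpy as np

# Stand-ins for the current collection and the argument collection.
current = np.array([5, 6, 7, 6])
objects = np.array([6, 9])

# One boolean per element of `current`: is it present in `objects`?
in_objects = np.isin(current, objects)
```

The result has the same length as the current collection, so it can be used directly as a mask for filter.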

merge(objects)

Return a new collection combining this one with the objects Collection. All duplicates are removed.

subtract(objects)

Return a new collection subtracting the objects Collection from this one. All duplicates are removed.

concatenate(collections)

Concatenate any number of collections returning a new collection. All collections must have the same type.

Parameters:

collections : sequence of Collection objects

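The contrast between concatenate and merge can be sketched with numpy. This assumes that concatenation keeps repeated objects, which the description above does not state explicitly; the integer arrays are stand-ins for collections.

```python
import numpy as np

# Stand-ins for two collections of the same type that share one object (3).
first = np.array([1, 2, 3])
second = np.array([3, 4])

# Concatenation appends in order and keeps the shared object twice.
joined = np.concatenate([first, second])

# A merge-like union removes the duplicate.
merged = np.union1d(first, second)
```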
class Atoms(atom_pointers=None)

Bases: Collection

An ordered collection of atom objects. This offers better performance than using a list of atoms. It provides methods to access atom attributes such as coordinates as numpy arrays. Atoms directly accesses the C++ atomic data without creating Python Atom objects which require much more memory and are slower to use in computation.

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.
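Setting with a single RGBA value versus a full array corresponds to ordinary numpy broadcasting. The sketch below works on a plain array that stands in for the attribute; it does not assign to a real Atoms object.

```python
import numpy as np

n_atoms = 3

# Stand-in for the Nx4 uint8 RGBA array described above.
colors = np.zeros((n_atoms, 4), dtype=np.uint8)

# A single RGBA value is broadcast to every row.
colors[:] = (255, 0, 0, 255)

# An individual row can then be changed on its own.
colors[1] = (0, 0, 255, 255)
```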

coords

Returns a numpy Nx3 array of XYZ values. Can be set.

displays

Controls whether the Atoms should be displayed. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

visibles

Returns whether the Atoms should be visible (displayed and not hidden). Returns a numpy array of boolean values. Read only.
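The rule "displayed and not hidden" can be written as an element-wise boolean expression. In this sketch the name hides is hypothetical, as is the assumption that hidden state is available as a boolean array.

```python
import numpy as np

# Stand-in for the displays attribute.
displays = np.array([True, True, False])

# Hypothetical per-atom "hidden" flags (not an attribute documented here).
hides = np.array([False, True, False])

# Visible means displayed and not hidden.
visibles = displays & ~hides
```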

draw_modes

Controls how the Atoms should be depicted, e.g. sphere, ball, etc. The values are integers, SPHERE_STYLE, BALL_STYLE or STICK_STYLE as documented in the Atom class. Returns a numpy array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.

element_names

Returns a numpy array of chemical element names. Read only.

element_numbers

Returns a numpy array of atomic numbers (integers). Read only.

in_chains

Whether each atom belongs to a polymer. Returns a numpy bool array. Read only.

structures

Returns an AtomicStructureDatas with the structure for each atom. Read only.

names

Returns a numpy array of atom names. Read only.

radii

Returns a numpy array of atomic radii. Can be set with such an array (or equivalent sequence), or with a single floating-point number.

residues

Returns a Residues whose data items correspond in a 1-to-1 fashion with the items in the Atoms. Read only.

selected

A numpy bool array indicating whether each atom is selected.

num_selected

Number of selected atoms.

unique_structures

The unique structures as an AtomicStructureDatas collection.

unique_residues

The unique Residues for these atoms.

by_structure

Return a list of pairs of structure and Atoms for that structure.

by_chain

Return a list of triples of structure, chain id, and Atoms for each chain.
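The shape of the by_chain result can be sketched with a plain grouping loop. Structure names and chain ids below are made up, and lists of atom indices stand in for the Atoms collections in each triple.

```python
from collections import defaultdict

# Hypothetical (structure, chain id) key for each of four atoms.
atom_chain_keys = [("1abc", "A"), ("1abc", "A"), ("1abc", "B"), ("2xyz", "A")]

# Group atom indices by their (structure, chain id) key.
groups = defaultdict(list)
for i, key in enumerate(atom_chain_keys):
    groups[key].append(i)

# Flatten into triples of structure, chain id, and the atoms in that chain.
by_chain = [(s, c, idx) for (s, c), idx in groups.items()]
```

Dropping the chain id from the key gives the analogous list of pairs for by_structure.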

scene_coords

Atom coordinates in the global scene coordinate system. This accounts for the Drawing positions for the hierarchy of models each atom belongs to.
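Converting local coordinates to scene coordinates amounts to applying a rigid transform. The sketch below applies a single made-up rotation and translation to a stand-in coordinate array; how the library composes the positions of a model hierarchy is not described here.

```python
import numpy as np

# Stand-in for local atom coordinates (Nx3).
coords = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

# Made-up model position: 90 degree rotation about z, then a shift along x.
rotation = np.array([[0.0, -1.0, 0.0],
                     [1.0,  0.0, 0.0],
                     [0.0,  0.0, 1.0]])
translation = np.array([10.0, 0.0, 0.0])

# Apply the rotation to every row, then translate.
scene_coords = coords @ rotation.T + translation
```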

delete()

Delete the C++ Atom objects.

class Bonds(bond_pointers)

Bases: Collection

Collection of C++ bonds.

atoms

Returns a two-tuple of Atoms objects. For each bond, its endpoint atoms are in the matching position in the two Atoms collections. Read only.
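Because the two collections are aligned bond by bond, per-bond quantities can be computed with one array operation. The sketch below uses two hand-made coordinate arrays in place of the coords of the two Atoms collections.

```python
import numpy as np

# Stand-ins for the coordinates of the first and second endpoint atoms;
# row i of each array belongs to bond i.
coords_a = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0]])
coords_b = np.array([[0.0, 0.0, 1.5],
                     [1.0, 2.0, 0.0]])

# Length of each bond: distance between matching rows.
lengths = np.linalg.norm(coords_b - coords_a, axis=1)
```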

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

displays

Controls whether the Bonds should be displayed. The values are integers defined in the Bond class. TODO: No values are defined and value not used for rendering. Returns a numpy array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.

visibles

Returns whether the Bonds should be visible. If hidden, the return value is Never; otherwise, same as display. Returns a numpy array of integers. Read only.

halfbonds

Controls whether the Bonds should be colored in “halfbond” mode, i.e. each half colored the same as its endpoint atom. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

radii

Returns a numpy array of bond radii (half thicknesses). Can be set with such an array (or equivalent sequence), or with a single floating-point number.

class Pseudobonds(pbond_pointers)

Bases: Collection

Holds a collection of C++ PBonds (pseudobonds) and provides access to some of their attributes. It has the same attributes as the Bonds class and works in an analogous fashion.

atoms

Returns a two-tuple of Atoms objects. For each pseudobond, its endpoint atoms are in the matching position in the two Atoms collections. Read only.

colors

Returns a numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

displays

Controls whether the pseudobonds should be displayed. TODO: No values are defined and value not used for rendering. Returns a numpy array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.

halfbonds

Controls whether the pseudobonds should be colored in “halfbond” mode, i.e. each half colored the same as its endpoint atom. Returns a numpy array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.

radii

Returns a numpy array of pseudobond radii (half thicknesses). Can be set with such an array (or equivalent sequence), or with a single floating-point number.

class Residues(residue_pointers)

Bases: Collection

Collection of C++ residue objects.

atoms

Return Atoms belonging to each residue all as a single collection. Read only.

chain_ids

Returns a numpy array of chain IDs. Read only.

is_helix

Returns a numpy bool array indicating whether each residue is in a protein helix. Read only.

is_sheet

Returns a numpy bool array indicating whether each residue is in a protein sheet. Read only.

structures

Returns AtomicStructureDatas collection containing structures for each residue.

names

Returns a numpy array of residue names. Read only.

num_atoms

Returns a numpy integer array of the number of atoms in each residue. Read only.
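The relationship between the per-residue atom counts and the single flattened atoms collection can be sketched as follows. The nested list of atom indices is a made-up stand-in for the atoms owned by each residue.

```python
import numpy as np

# Hypothetical atom indices belonging to each of three residues.
atoms_per_residue = [[0, 1, 2], [3, 4], [5]]

# One count per residue, analogous to num_atoms.
num_atoms = np.array([len(a) for a in atoms_per_residue])

# All atoms of all residues as one flat sequence, analogous to atoms.
all_atoms = np.concatenate([np.array(a) for a in atoms_per_residue])
```

In this sketch the counts sum to the length of the flattened sequence.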

numbers

Returns a numpy array of residue sequence numbers, as provided by whatever data source the structure came from, so not necessarily consecutive, or starting from 1, etc. Read only.

ss_id

A numpy array of integer secondary structure IDs.

strs

Returns a numpy array of strings that encapsulates each residue’s name, sequence position, and chain ID in a readable form. Read only.

unique_ids

A numpy array of integers. Multiple copies of the same residue in the collection will have the same integer value in the returned array. Read only.
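The property described here, equal integers for repeated residues, is the same one numpy's return_inverse provides. In the sketch below the strings are made-up residue labels, and the particular integer values are an artifact of np.unique rather than anything the library guarantees.

```python
import numpy as np

# Stand-in for a collection in which the first residue appears twice.
residues = np.array(["ALA 5 A", "GLY 6 A", "ALA 5 A"])

# The inverse indices assign one integer per distinct residue.
_, unique_ids = np.unique(residues, return_inverse=True)
```

Positions 0 and 2 receive the same id, so the array can be used to group or de-duplicate entries.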

ribbon_displays

A numpy bool array indicating whether to display each residue in ribbon style.

ribbon_colors

A numpy Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.

unique_structures

The unique structures as an AtomicStructureDatas collection.

unique_chain_ids

The unique chain IDs as a numpy array of strings.

get_polymer_spline()

Return a tuple of spline center and guide coordinates for a polymer chain. Residues in the chain that do not have a center atom will have their display bit turned off. Center coordinates are returned as a numpy array. Guide coordinates are only returned if all spline atoms have matching guide atoms; otherwise, None is returned for guide coordinates.

class Chains(chain_pointers)

Bases: Collection

Collection of C++ chain objects.

chain_ids

A numpy array of string chain ids for each chain. Read only.

structures

An AtomicStructureDatas collection containing structures for each chain.

residues

A Residues containing the residues of all chains. Read only.

num_residues

A numpy integer array containing the number of residues in each chain.

class AtomicStructureDatas(mol_pointers)

Bases: Collection

Collection of C++ atomic structure objects.

atoms

A single Atoms containing atoms for all structures. Read only.

bonds

A single Bonds object containing bonds for all structures. Read only.

chains

A single Chains object containing chains for all structures. Read only.

names

A numpy string array of names of each structure.

num_atoms

Number of atoms in each structure. Read only.

num_bonds

Number of bonds in each structure. Read only.

num_chains

Number of chains in each structure. Read only.

num_residues

Number of residues in each structure. Read only.

residues

A single Residues object containing residues for all structures. Read only.

pbg_maps

Returns a list of dictionaries whose keys are pseudobond group categories (strings) and whose values are Pseudobonds. Read only.

metadata

Return a list of dictionaries with metadata. Read only.