molarray: Collections of molecular objects
The classes Atoms, Bonds, Residues, and so on provide access to collections of C++
molecular data objects. One typically gets these from an AtomicStructure
object, which is produced by reading a PDB file.
Data items in a collection are ordered and the same object may be repeated.
Collections have attributes such as Atoms.coords that return a numpy array of
values for each object in the collection. This offers better performance
than using a Python list of atoms since it directly accesses the C++ atomic data.
When using a list of atoms, a Python Atom object is created for each
atom, which requires much more memory and is slower to use in computation.
Working with lists is still often desirable when computations are not easily
done using arrays of attributes. To get a list of atoms use list(x) where
x is an Atoms collection. Collections are iterable, so constructs
such as a “for” loop over an Atoms collection are valid: “for a in atoms: ...”.
There are collections Atoms, Bonds, Pseudobonds, Residues, Chains, AtomicStructureDatas.
Some attributes return collections instead of numpy arrays. For example, atoms.residues returns a Residues collection that has one residue for each atom in the collection atoms. If only a collection of unique residues is desired, use atoms.unique_residues.
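The difference between a one-per-atom result and a unique result can be sketched in plain Python. This is only an illustration: the (name, residue) tuples are hypothetical stand-ins for atom objects, no library classes are used, and keeping first-seen order in the unique list is an assumption.

```python
# Hypothetical stand-ins: each atom is a (name, residue) pair, not a real Atom object.
atoms = [
    ("N", "ALA 1"),
    ("CA", "ALA 1"),
    ("N", "GLY 2"),
    ("CA", "GLY 2"),
]

# Analogue of atoms.residues: one residue per atom, so residues repeat.
residues = [residue for _name, residue in atoms]

# Analogue of atoms.unique_residues: each residue appears once.
# Keeping first-seen order is an assumption of this sketch.
unique_residues = list(dict.fromkeys(residues))
```

Here residues has the same length as atoms, while unique_residues holds each residue once.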
Collections have base class Collection, which provides many standard methods
such as length, iteration, indexing with square brackets, index of an element,
intersections, unions, subtraction, and filtering.
Collections are immutable, so they can be hashed. Their contents change only when C++ objects they hold are deleted, in which case those objects are automatically removed from the collection.
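The behavior described here (ordered, repeats allowed, immutable and hashable, set operations that drop duplicates) can be illustrated with a small plain-Python stand-in. This is a sketch under stated assumptions: ToyCollection is a hypothetical class, not the library's Collection; integers stand in for C++ object pointers; and keeping first-seen order in set-operation results is an assumption that the text does not confirm.

```python
class ToyCollection:
    """Hypothetical stand-in for Collection; integers play the role of C++ pointers."""

    def __init__(self, pointers):
        # Stored as a tuple so the contents cannot be changed and can be hashed.
        self._pointers = tuple(pointers)

    def __len__(self):
        return len(self._pointers)

    def __iter__(self):
        return iter(self._pointers)

    def __getitem__(self, i):
        return self._pointers[i]

    def __eq__(self, other):
        return isinstance(other, ToyCollection) and self._pointers == other._pointers

    def __hash__(self):
        return hash(self._pointers)

    def index(self, obj):
        # Position of the first occurrence of obj.
        return self._pointers.index(obj)

    def _unique(self, items):
        # Drop duplicates, keeping first-seen order (an assumption of this sketch).
        return ToyCollection(dict.fromkeys(items))

    def __or__(self, other):
        return self._unique(list(self) + list(other))

    def __and__(self, other):
        keep = set(other)
        return self._unique(p for p in self if p in keep)

    def __sub__(self, other):
        drop = set(other)
        return self._unique(p for p in self if p not in drop)


a = ToyCollection([1, 2, 2, 3])   # the same object may be repeated
b = ToyCollection([3, 4])
union = list(a | b)
common = list(a & b)
difference = list(a - b)
```

Because the stand-in is hashable, instances can be used as dictionary keys or set members, which is the practical consequence of immutability noted above.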
-
class
Collection
(pointers, object_class, objects_class)¶ Base class of all molecular data collections that provides common methods such as length, iteration, indexing with square brackets, intersection, union, subtracting, and filtering.
-
__len__
()¶ Number of objects in collection.
-
__iter__
()¶ Iterator over collection objects.
-
__getitem__
(i)¶ Indexing of collection objects using square brackets, e.g. c[i].
-
index
(object)¶ Find the position of the first occurrence of an object in a collection.
-
__or__
(objects)¶ The or operator | takes the union of two collections removing duplicates.
-
__and__
(objects)¶ The and operator & takes the intersection of two collections removing duplicates.
-
__sub__
(objects)¶ The subtract operator “-” subtracts one collection from another as sets, eliminating all duplicates.
-
intersect
(objects)¶ Return a new collection that is the intersection with the objects
Collection
.
-
intersects
(objects)¶ Whether this collection has any element in common with the objects
Collection
. Returns bool.
-
intersects_each
(objects_list)¶ Check if each of several pointer arrays intersects this array. Return a boolean array of length equal to the length of objects_list.
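The per-collection check can be sketched with plain Python lists. The function below is a hypothetical stand-in, not the library's method, and returns a Python list where the real method returns a boolean array.

```python
def intersects_each(collection, objects_list):
    """Return one bool per entry of objects_list: does it share an item with collection?"""
    members = set(collection)
    return [any(obj in members for obj in objects) for objects in objects_list]


# Integers stand in for object pointers in this sketch.
result = intersects_each([1, 2, 3], [[3, 4], [5], []])
```

The result has one entry per element of objects_list, and an empty entry never intersects.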
-
filter
(mask)¶ Return a subset of the collection as a new collection.
Parameters: mask : numpy bool array
Array length must match the length of the collection.
-
mask
(objects)¶ Return bool array indicating for each object in current set whether that object appears in the argument objects.
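How mask and filter fit together can be sketched with plain lists. The helper functions below are hypothetical stand-ins, not the library's methods, and use Python lists where the real methods use numpy bool arrays.

```python
def mask(collection, objects):
    """For each item in collection, True if it also appears in objects."""
    wanted = set(objects)
    return [item in wanted for item in collection]


def filter_by_mask(collection, bool_mask):
    """Keep the items whose mask entry is True; the lengths must match."""
    if len(bool_mask) != len(collection):
        raise ValueError("mask length must match collection length")
    return [item for item, keep in zip(collection, bool_mask) if keep]


atoms = ["N", "CA", "C", "O", "CA"]   # stand-ins for atom objects; repeats allowed
backbone = ["N", "CA", "C"]

m = mask(atoms, backbone)
subset = filter_by_mask(atoms, m)
```

In this sketch filtering keeps the original order and any repeated items, since it selects positions rather than performing a set operation.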
-
merge
(objects)¶ Return a new collection combining this one with the objects
Collection
. All duplicates are removed.
-
subtract
(objects)¶ Return a new collection subtracting the objects
Collection
from this one. All duplicates are removed.
-
-
concatenate
(collections)¶ Concatenate any number of collections returning a new collection. All collections must have the same type.
Parameters: collections : sequence of Collection
objects
-
class
Atoms
(atom_pointers=None)¶ Bases:
Collection
An ordered collection of atom objects. This offers better performance than using a list of atoms. It provides methods to access atom attributes such as coordinates as numpy arrays. Atoms directly accesses the C++ atomic data without creating Python
Atom
objects which require much more memory and are slower to use in computation.
-
colors
¶ Returns a
numpy
Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.
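The "set with an array or with a single value" convention, which recurs for several attributes in this reference, can be sketched with a plain-Python property. ToyAtoms is a hypothetical class, not the library's Atoms, and it stores tuples in a list where the real attribute uses an Nx4 uint8 numpy array.

```python
class ToyAtoms:
    """Hypothetical stand-in for Atoms holding one RGBA tuple per atom."""

    def __init__(self, count):
        self._colors = [(255, 255, 255, 255)] * count

    @property
    def colors(self):
        # Return a copy so callers cannot modify the stored values in place.
        return list(self._colors)

    @colors.setter
    def colors(self, value):
        value = list(value)
        if value and isinstance(value[0], int):
            # A single RGBA value: apply it to every atom.
            rgba = tuple(value)
            if len(rgba) != 4:
                raise ValueError("an RGBA value needs 4 components")
            self._colors = [rgba] * len(self._colors)
        else:
            # One RGBA value per atom.
            if len(value) != len(self._colors):
                raise ValueError("need one RGBA value per atom")
            self._colors = [tuple(c) for c in value]


atoms = ToyAtoms(3)
atoms.colors = (255, 0, 0, 255)   # single value, applied to all atoms
all_red = atoms.colors
atoms.colors = [(255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255)]
per_atom = atoms.colors
```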
-
displays
¶ Controls whether the Atoms should be displayed. Returns a
numpy
array of boolean values. Can be set with such an array (or equivalent sequence), or with a single boolean value.
-
visibles
¶ Returns whether the Atoms should be visible (displayed and not hidden). Returns a
numpy
array of boolean values. Read only.
-
draw_modes
¶ Controls how the Atoms should be depicted, e.g. sphere, ball, etc. The values are integers, SPHERE_STYLE, BALL_STYLE or STICK_STYLE as documented in the
Atom
class. Returns a numpy
array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.
-
element_names
¶ Returns a numpy array of chemical element names. Read only.
-
in_chains
¶ Whether each atom belongs to a polymer. Returns numpy bool array. Read only.
-
structures
¶ Returns an
AtomicStructureDatas
with the structure for each atom. Read only.
-
names
¶ Returns a numpy array of atom names. Read only.
-
radii
¶ Returns a
numpy
array of atomic radii. Can be set with such an array (or equivalent sequence), or with a single floating-point number.
-
residues
¶ Returns a
Residues
whose data items correspond in a 1-to-1 fashion with the items in the Atoms. Read only.
-
selected
¶ A numpy bool array indicating whether each atom is selected.
-
num_selected
¶ Number of selected atoms.
-
unique_structures
¶ The unique structures as an
AtomicStructureDatas
collection.
-
by_structure
¶ Return list of pairs of structure and Atoms for that structure.
-
by_chain
¶ Return list of triples of structure, chain id, and Atoms for each chain.
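The shape of the values returned by by_structure and by_chain can be sketched with plain Python. The functions below are hypothetical stand-ins rather than the library's code: atoms are modeled as (structure, chain id, atom name) tuples, groups are plain lists instead of Atoms collections, and first-seen group order is an assumption.

```python
def by_structure(atoms):
    """Group atoms into (structure, atoms) pairs."""
    groups = {}
    for atom in atoms:
        structure = atom[0]
        groups.setdefault(structure, []).append(atom)
    return list(groups.items())


def by_chain(atoms):
    """Group atoms into (structure, chain_id, atoms) triples."""
    groups = {}
    for atom in atoms:
        structure, chain_id = atom[0], atom[1]
        groups.setdefault((structure, chain_id), []).append(atom)
    return [(s, c, members) for (s, c), members in groups.items()]


# Hypothetical data: (structure name, chain id, atom name).
atoms = [
    ("1abc", "A", "N"),
    ("1abc", "A", "CA"),
    ("1abc", "B", "N"),
    ("2xyz", "A", "N"),
]

structure_groups = by_structure(atoms)
chain_groups = by_chain(atoms)
```

Note that chain "A" of the two structures ends up in separate groups, because the chain id alone does not identify a chain across structures.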
-
scene_coords
¶ Atom coordinates in the global scene coordinate system. This accounts for the
Drawing
positions for the hierarchy of models each atom belongs to.
-
delete
()¶ Delete the C++ Atom objects
-
-
class
Bonds
(bond_pointers)¶ Bases:
Collection
Collection of C++ bonds.
-
atoms
¶ Returns a two-tuple of
Atoms
objects. For each bond, its endpoint atoms are in the matching position in the two Atoms
collections. Read only.
-
colors
¶ Returns a
numpy
Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.
-
displays
¶ Controls whether the Bonds should be displayed. The values are integers defined in the
Bond
class. TODO: No values are defined and value not used for rendering. Returns a numpy
array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.
-
visibles
¶ Returns whether the Bonds should be visible. If hidden, the return value is Never; otherwise, same as display. Returns a
numpy
array of integers. Read only.
-
-
class
Pseudobonds
(pbond_pointers)¶ Bases:
Collection
Holds a collection of C++ PBonds (pseudobonds) and provides access to some of their attributes. It has the same attributes as the
Bonds
class and works in an analogous fashion.
-
atoms
¶ Returns a two-tuple of
Atoms
objects. For each bond, its endpoint atoms are in the matching position in the two Atoms
collections. Read only.
-
colors
¶ Returns a
numpy
Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.
-
displays
¶ Controls whether the pseudobonds should be displayed. TODO: No values are defined and value not used for rendering. Returns a
numpy
array of integers. Can be set with such an array (or equivalent sequence), or with a single integer value.
-
-
class
Residues
(residue_pointers)¶ Bases:
Collection
Collection of C++ residue objects.
-
chain_ids
¶ Returns a numpy array of chain IDs. Read only.
-
is_helix
¶ Returns a numpy bool array indicating whether each residue is in a protein helix. Read only.
-
is_sheet
¶ Returns a numpy bool array indicating whether each residue is in a protein sheet. Read only.
-
structures
¶ Returns an
AtomicStructureDatas
collection containing structures for each residue.
-
names
¶ Returns a numpy array of residue names. Read only.
-
num_atoms
¶ Returns a numpy integer array of the number of atoms in each residue. Read only.
-
numbers
¶ Returns a
numpy
array of residue sequence numbers, as provided by whatever data source the structure came from, so not necessarily consecutive, or starting from 1, etc. Read only.
-
ss_id
¶ numpy array of integer secondary structure ids.
-
strs
¶ Returns a numpy array of strings that encapsulates each residue’s name, sequence position, and chain ID in a readable form. Read only.
-
unique_ids
¶ A
numpy
array of integers. Multiple copies of the same residue in the collection will have the same integer value in the returned array. Read only.
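The idea of giving repeated items the same integer can be sketched in plain Python. The function below is a hypothetical stand-in, not the library's implementation; the particular integers it assigns (consecutive, in first-seen order) are an assumption, and only the equal-for-equal property reflects the description.

```python
def unique_ids(residues):
    """Return one integer per residue; repeated residues share the same integer."""
    ids = {}
    # len(ids) is evaluated before setdefault inserts, so new items get 0, 1, 2, ...
    return [ids.setdefault(residue, len(ids)) for residue in residues]


# Strings stand in for residue objects; "ALA 1" appears twice.
ids = unique_ids(["ALA 1", "GLY 2", "ALA 1", "SER 3"])
```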
-
ribbon_displays
¶ A numpy bool array indicating whether to display each residue in ribbon style.
-
ribbon_colors
¶ A
numpy
Nx4 array of uint8 RGBA values. Can be set with such an array (or equivalent sequence), or with a single RGBA value.
-
unique_structures
¶ The unique structures as an
AtomicStructureDatas
collection.
-
unique_chain_ids
¶ The unique chain IDs as a numpy array of strings.
-
get_polymer_spline
()¶ Return a tuple of spline center and guide coordinates for a polymer chain. Residues in the chain that do not have a center atom will have their display bit turned off. Center coordinates are returned as a numpy array. Guide coordinates are only returned if all spline atoms have matching guide atoms; otherwise, None is returned for guide coordinates.
-
-
class
Chains
(chain_pointers)¶ Bases:
Collection
Collection of C++ chain objects.
-
chain_ids
¶ A numpy array of string chain ids for each chain. Read only.
-
structures
¶ An
AtomicStructureDatas
collection containing structures for each chain.
-
num_residues
¶ A numpy integer array containing the number of residues in each chain.
-
-
class
AtomicStructureDatas
(mol_pointers)¶ Bases:
Collection
Collection of C++ atomic structure objects.
-
names
¶ A numpy string array of names of each structure.
-
num_atoms
¶ Number of atoms in each structure. Read only.
-
num_bonds
¶ Number of bonds in each structure. Read only.
-
num_chains
¶ Number of chains in each structure. Read only.
-
num_residues
¶ Number of residues in each structure. Read only.
-
pbg_maps
¶ Returns a list of dictionaries whose keys are pseudobond group categories (strings) and whose values are
Pseudobonds
. Read only.
-
metadata
¶ Return a list of dictionaries with metadata. Read only.
-