Bundle Example: Save a New File Format

This example describes how to create a ChimeraX bundle that allows ChimeraX to open and save data files in XYZ format, which is a simple format containing only information about atomic types and coordinates. The example files are almost identical to those from Bundle Example: Read a New File Format, with a few additions for saving XYZ files.

Code for both reading and writing a new format is typically supplied in the same bundle. However, an alternative is to have separate bundles for reading and writing, by making one bundle dependent on the other. For example, the base bundle can define the data format and handle open requests; the dependent bundle can then handle only save requests (using the same data format definition from the base bundle).

The steps in implementing the bundle are:

  1. Create a bundle_info.xml containing information about the bundle,

  2. Create a Python package that interfaces with ChimeraX and implements the file-reading and file-saving functionality, and

  3. Install and test the bundle in ChimeraX.

The final step builds a Python wheel that ChimeraX uses to install the bundle. So if the bundle passes testing, it is immediately available for sharing with other users.

Source Code Organization

The source code for this example may be downloaded as a zip-format file containing a folder named tut_save. Alternatively, one can start with an empty folder and create source files based on the samples below. The source folder may be arbitrarily named, as it is only used during installation; however, avoiding whitespace characters in the folder name bypasses the need to type quote characters in some steps.

Sample Files

The files in the tut_save folder are:

  • tut_save - bundle folder
    • bundle_info.xml - bundle information read by ChimeraX

    • src - source code to Python package for bundle
      • __init__.py - package initializer and interface to ChimeraX

      • io.py - source code to read and save XYZ format files

The file contents are shown below.

bundle_info.xml

bundle_info.xml is an eXtensible Markup Language format file whose tags are listed in Bundle Information XML Tags. While there are many tags defined, only a few are needed for bundles written completely in Python. The bundle_info.xml in this example is similar to the one from the Bundle Example: Add a Tool example with changes highlighted. For explanations of the unhighlighted sections, please see Bundle Example: Hello World, Bundle Example: Add a Command, Bundle Example: Add a Tool, and Bundle Example: Read a New File Format.

 1<!--
 2ChimeraX bundle names must start with "ChimeraX-"
 3to avoid clashes with package names in pypi.python.org.
 4When uploaded to the ChimeraX toolshed, the bundle
 5will be displayed without the ChimeraX- prefix.
 6-->
 7
 8<BundleInfo name="ChimeraX-TutorialSaveFormat"
 9	    version="0.1" package="chimerax.tut_save"
10  	    minSessionVersion="1" maxSessionVersion="1">
11
12  <!-- Additional information about bundle source -->
13  <Author>UCSF RBVI</Author>
14  <Email>chimerax@cgl.ucsf.edu</Email>
15  <URL>https://www.rbvi.ucsf.edu/chimerax/</URL>
16
17  <!-- Synopsis is a one-line description
18       Description is a full multi-line description -->
19  <Synopsis>Example for reading and saving XYZ format files</Synopsis>
20  <Description>Example code for implementing ChimeraX bundle.
21
22Implements capability for reading and saving XYZ format files.
23  </Description>
24
25  <!-- Categories is a list where this bundle should appear -->
26  <Categories>
27    <Category name="General"/>
28  </Categories>
29
30  <!-- Dependencies on other ChimeraX/Python packages -->
31  <Dependencies>
32    <Dependency name="ChimeraX-Core" version="~=1.1"/>
33  </Dependencies>
34
35    <!-- Register XYZ format as one of the supported input file formats -->
36  <Providers manager="data formats">
37    <Provider name="XYZ" suffixes=".xyz" category="Molecular structure"
38		reference_url="https://en.wikipedia.org/wiki/XYZ_file_format"
39		encoding="utf-8" mime_types="chemical/x-xyz" />
40  </Providers>
41
42  <Providers manager="open command">
43    <Provider name="XYZ" />
44  </Providers>
45
46  <Providers manager="save command">
47    <Provider name="XYZ" />
48  </Providers>
49
50  <Classifiers>
51    <!-- Development Status should be compatible with bundle version number -->
52    <PythonClassifier>Development Status :: 3 - Alpha</PythonClassifier>
53    <PythonClassifier>License :: Freeware</PythonClassifier>
54  </Classifiers>
55
56</BundleInfo>

The BundleInfo, Synopsis and Description tags are changed to reflect the new bundle name and documentation (lines 8-10 and 17-23).

The Providers sections on lines 36 through 48 use the Manager/Provider protocol to inform the “data formats” manager about the XYZ format, and to inform the “open command” and “save command” managers that this bundle can open and save XYZ files, respectively.

The attributes usable with the “data formats” manager are described in detail in Defining a File/Data Format. Note that most formats have a longer official name than “XYZ” and therefore most formats will also specify nicknames and synopsis attributes, whereas they are unneeded in this example.

The “open command” attributes are described in detail in Opening Files. Likewise, the “save command” attributes are described in detail in Saving Files. It is typical that the only attribute specified is name.
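To double-check which managers a bundle_info.xml registers providers with, the file can be inspected outside ChimeraX using only the Python standard library. The following is a minimal sketch, assuming an abbreviated copy of the Providers sections shown above; the helper name providers_by_manager is hypothetical and is not part of ChimeraX.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Providers sections from bundle_info.xml above.
BUNDLE_INFO = """
<BundleInfo name="ChimeraX-TutorialSaveFormat"
            version="0.1" package="chimerax.tut_save"
            minSessionVersion="1" maxSessionVersion="1">
  <Providers manager="data formats">
    <Provider name="XYZ" suffixes=".xyz" category="Molecular structure"
              reference_url="https://en.wikipedia.org/wiki/XYZ_file_format"
              encoding="utf-8" mime_types="chemical/x-xyz" />
  </Providers>
  <Providers manager="open command">
    <Provider name="XYZ" />
  </Providers>
  <Providers manager="save command">
    <Provider name="XYZ" />
  </Providers>
</BundleInfo>
"""


def providers_by_manager(xml_text):
    """Map each manager name to a list of (provider name, other attributes).

    Hypothetical helper for illustration; ChimeraX does its own parsing.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for providers in root.findall("Providers"):
        manager = providers.get("manager")
        for provider in providers.findall("Provider"):
            attrs = dict(provider.attrib)
            name = attrs.pop("name")
            result.setdefault(manager, []).append((name, attrs))
    return result


summary = providers_by_manager(BUNDLE_INFO)
for manager, entries in summary.items():
    for name, attrs in entries:
        print(f"{manager}: {name} {sorted(attrs)}")
```

Only the “data formats” provider carries attributes beyond name, which matches the observation that name is typically the only attribute given to the “open command” and “save command” managers.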

src

src is the folder containing the source code for the Python package that implements the bundle functionality. The ChimeraX devel command, used for building and installing bundles, automatically includes all .py files in src as part of the bundle. (Additional files may also be included using bundle information tags such as DataFiles as shown in Bundle Example: Add a Tool.) The only required file in src is __init__.py. Other .py files are typically arranged to implement different types of functionality. For example, cmd.py is used for command-line commands; tool.py or gui.py for graphical interfaces; io.py for reading and saving files, etc.

src/__init__.py

As described in Bundle Example: Hello World, __init__.py contains the initialization code that defines the bundle_api object that ChimeraX needs in order to invoke bundle functionality. ChimeraX expects the class of bundle_api to be derived from chimerax.core.toolshed.BundleAPI, with methods overridden for registering commands, tools, etc.

 1# vim: set expandtab shiftwidth=4 softtabstop=4:
 2
 3from chimerax.core.toolshed import BundleAPI
 4
 5
 6# Subclass from chimerax.core.toolshed.BundleAPI and
 7# override the method for opening and saving files,
 8# inheriting all other methods from the base class.
 9class _MyAPI(BundleAPI):
10
11    api_version = 1
12
13    # Implement provider methods for opening and saving files
14    @staticmethod
15    def run_provider(session, name, mgr):
16        # 'run_provider' is called by a manager to invoke the 
17        # functionality of the provider.  Since the "data formats"
18        # manager never calls run_provider (all the info it needs
19        # is in the Provider tag), we know that only the "open
20        # command" or "save command" managers will call this
21        # function, and customize it accordingly.
22        #
23        # The 'name' arg will be the same as the 'name' attribute
24        # of your Provider tag, and mgr will be the corresponding
25        # Manager instance
26        #
27        # For the "open command" manager, this method must return
28        # a chimerax.open_command.OpenerInfo subclass instance.
29        # For the "save command" manager, this method must return
30        # a chimerax.save_command.SaverInfo subclass instance.
31        #
32        # The "open command" manager is also session.open_command,
33        # and likewise the "save command" manager is
34        # session.save_command.  We therefore decide what to do
35        # by testing our 'mgr' argument...
36        if mgr == session.open_command:
37            from chimerax.open_command import OpenerInfo
38            class XyzInfo(OpenerInfo):
39                def open(self, session, data, file_name, **kw):
40                    # The 'open' method is called to open a file,
41                    # and must return a (list of models created,
42                    # status message) tuple.
43                    from .io import open_xyz
44                    return open_xyz(session, data)
45        else:
46            from chimerax.save_command import SaverInfo
47            class XyzInfo(SaverInfo):
48                def save(self, session, path, *, structures=None):
49                    # The 'save' method is called to save a file.
50                    # There is no return value.
51                    #
52                    # This bundle supports an optional 'structures'
53                    # keyword argument to the save command and
54                    # therefore will have the 'structures' argument
55                    # to this function provided with whatever
56                    # value the user supplied, if any.  The
57                    # 'save_args' property below informs the
58                    # "save command" manager of the optional
59                    # keywords and their value types that this
60                    # bundle supports.
61                    from .io import save_xyz
62                    save_xyz(session, path, structures)
63
64                @property
65                def save_args(self):
66                    # The 'save_args' property informs the
67                    # "save command" manager of any optional
68                    # bundle/format-specific keyword arguments
69                    # to the 'save' command that this bundle
70                    # supports.  If given by the user, they will
71                    # be provided to the above 'save' method.  If
72                    # there are no such keywords, you need not
73                    # implement this property.
74                    #
75                    # This property should return a dictionary
76                    # that maps the *Python* name of a keyword to
77                    # an Annotation subclass.  Annotation classes
78                    # are used to convert user-typed text into
79                    # Python values.
80                    from chimerax.atomic import StructuresArg
81                    return { 'structures': StructuresArg }
82
83        return XyzInfo()
84
85
86# Create the ``bundle_api`` object that ChimeraX expects.
87bundle_api = _MyAPI()

The run_provider() method is called by a ChimeraX manager when it needs additional information from a provider or it needs a provider to execute a task. The session argument is a Session instance, the name argument is the same as the name attribute in your Provider tag, and the mgr argument is the manager instance. These arguments can be used to decide what to do when your bundle offers several Provider tags, as in this example. The “data formats” manager never calls run_provider(), so we only need to know whether it is the “open command” or the “save command” manager calling this method. The “open command” manager is also session.open_command (and the “save command” manager is session.save_command), so we use the test on line 36 to decide.
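The dispatch on the mgr argument can be exercised outside ChimeraX with placeholder objects. The sketch below is illustrative only: the OpenerInfo and SaverInfo classes defined here are empty stand-ins for the real chimerax.open_command.OpenerInfo and chimerax.save_command.SaverInfo, the session is a fake built from SimpleNamespace, and str takes the place of StructuresArg.

```python
from types import SimpleNamespace


# Empty stand-ins for chimerax.open_command.OpenerInfo and
# chimerax.save_command.SaverInfo (hypothetical, illustration only).
class OpenerInfo:
    pass


class SaverInfo:
    pass


def run_provider(session, name, mgr):
    """Return an opener or a saver depending on which manager is calling."""
    if mgr == session.open_command:
        class XyzInfo(OpenerInfo):
            def open(self, session, data, file_name, **kw):
                # Real code would return (models, status message).
                return [], "opened %s" % file_name
    else:
        class XyzInfo(SaverInfo):
            def save(self, session, path, *, structures=None):
                # Real code would write the file; here we just record the call.
                session.saved.append((path, structures))

            @property
            def save_args(self):
                # 'str' is a placeholder for an Annotation subclass.
                return {"structures": str}
    return XyzInfo()


# A fake session whose two managers are distinct placeholder objects.
session = SimpleNamespace(open_command=object(), save_command=object(),
                          saved=[])

opener = run_provider(session, "XYZ", session.open_command)
saver = run_provider(session, "XYZ", session.save_command)
saver.save(session, "out.xyz", structures=None)
print(isinstance(opener, OpenerInfo), isinstance(saver, SaverInfo))
```

Because the two managers are distinct objects, a single equality test is enough to tell the callers apart; a bundle with providers for more managers would need additional tests on mgr or name.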

The information needed by the “open command” manager is returned by the code on lines 37-44 and is described in detail in Bundle Example: Read a New File Format.

When called by the “save command” manager, run_provider() must return an instance of a subclass of chimerax.save_command.SaverInfo. The methods of the class are thoroughly documented in the chimerax.save_command.SaverInfo reference, but briefly:

  1. The save() method is called to actually save the file (and has no return value). The method’s path argument is the full path name of the file to save.

  2. If there are format-specific keyword arguments that the save command should handle, then a save_args() property should be implemented, which returns a dictionary mapping Python keyword names to Annotation subclasses. Such keywords will be passed to your save() method.

  3. If your underlying file-writing function uses open_output() to open the path, then compression implied by the file name (e.g. an additional .gz suffix) will be handled automatically.

  4. In the rare case where you save a file type that ChimeraX knows how to open but would be inappropriate to open for some reason, set in_file_history to False to exclude it from the file history listing.
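To illustrate what "compression implied by the file name" means in item 3, here is a minimal standard-library sketch. The function open_text_output is a hypothetical stand-in written for this example; it is not ChimeraX's open_output() and makes no claim about how that function is implemented or which suffixes it recognizes.

```python
import gzip
import os
import tempfile


def open_text_output(path, encoding="utf-8"):
    """Open 'path' for writing text, gzip-compressing if it ends in .gz.

    Hypothetical stand-in for illustration; not ChimeraX's open_output().
    """
    if path.endswith(".gz"):
        return gzip.open(path, "wt", encoding=encoding)
    return open(path, "w", encoding=encoding)


with tempfile.TemporaryDirectory() as folder:
    plain_path = os.path.join(folder, "water.xyz")
    gz_path = os.path.join(folder, "water.xyz.gz")

    # The writing code is identical for both paths.
    for p in (plain_path, gz_path):
        with open_text_output(p) as f:
            print("1\ncomment\nO 0.000 0.000 0.000", file=f)

    # Read both files back to confirm they hold the same text.
    with open(gz_path, "rb") as f:
        gz_magic = f.read(2)
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        gz_text = f.read()
    with open(plain_path, encoding="utf-8") as f:
        plain_text = f.read()

print(gz_text == plain_text)
```

The point of the design is that the code writing the atoms never needs to know whether the output is compressed; only the function that opens the path looks at the suffix.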

src/io.py

  1# vim: set expandtab shiftwidth=4 softtabstop=4:
  2
  3
  4def open_xyz(session, stream):
  5    """Read an XYZ file from a file-like object.
  6
  7    Returns the 2-tuple return value expected by the
  8    "open command" manager's :py:meth:`run_provider` method.
  9    """
 10    structures = []
 11    line_number = 0
 12    atoms = 0
 13    bonds = 0
 14    while True:
 15        s, line_number = _read_block(session, stream, line_number)
 16        if not s:
 17            break
 18        structures.append(s)
 19        atoms += s.num_atoms
 20        bonds += s.num_bonds
 21    status = ("Opened XYZ file containing %d structures (%d atoms, %d bonds)" %
 22              (len(structures), atoms, bonds))
 23    return structures, status
 24
 25
 26def _read_block(session, stream, line_number):
 27    # XYZ files are stored in blocks, with each block representing
 28    # a set of atoms.  This function reads a single block
 29    # and builds a ChimeraX AtomicStructure instance containing
 30    # the atoms listed in the block.
 31
 32    # First line should be an integer count of the number of
 33    # atoms in the block.
 34    count_line = stream.readline()
 35    if not count_line:
 36        # Reached EOF, normal termination condition
 37        return None, line_number
 38    line_number += 1
 39    try:
 40        count = int(count_line)
 41    except ValueError:
 42        session.logger.error("line %d: atom count missing" % line_number)
 43        return None, line_number
 44
 45    # Create the AtomicStructure instance for atoms in this block.
 46    # All atoms in the structure are placed in one residue
 47    # since XYZ format does not partition atoms into groups.
 48    from chimerax.atomic import AtomicStructure
 49    from numpy import array, float64
 50    s = AtomicStructure(session)
 51    residue = s.new_residue("UNK", 'A', 1)
 52
 53    # XYZ format supplies the atom element type only, but
 54    # ChimeraX keeps track of both the element type and
 55    # a unique name for each atom.  To construct the unique
 56    # atom name, the 'element_count' dictionary is used
 57    # to track the number of atoms of each element type so far,
 58    # and the current count is used to build unique atom names.
 59    element_count = {}
 60
 61    # Next line is a comment line
 62    s.comment = stream.readline().strip()
 63    line_number += 1
 64
 65    # There should be "count" lines of atoms.
 66    for n in range(count):
 67        atom_line = stream.readline()
 68        if not atom_line:
 69            session.logger.error("line %d: atom data missing" % line_number)
 70            return None, line_number
 71        line_number += 1
 72
 73        # Extract available data
 74        parts = atom_line.split()
 75        if len(parts) != 4:
 76            session.logger.error("line %d: atom data malformatted"
 77                                 % line_number)
 78            return None, line_number
 79
 80        # Convert to required parameters for creating atom.
 81    # Since XYZ format only requires the atom element, we
 82        # create a unique atom name by putting a number after
 83        # the element name.
 84        xyz = [float(v) for v in parts[1:]]
 85        element = parts[0]
 86        n = element_count.get(element, 0) + 1
 87        name = element + str(n)
 88        element_count[element] = n
 89
 90        # Create atom in AtomicStructure instance 's',
 91        # set its coordinates, and add to residue
 92        atom = s.new_atom(name, element)
 93        atom.coord = array(xyz, dtype=float64)
 94        residue.add_atom(atom)
 95
 96    # Use AtomicStructure method to add bonds based on interatomic distances
 97    s.connect_structure()
 98
 99    # Updating state such as atom types while adding atoms iteratively
100    # is unnecessary (and generally incorrect for partial structures).
101    # When all atoms have been added, the instance is notified to
102    # tell it to update internal state.
103    s.new_atoms()
104
105    # Return AtomicStructure instance and current line number
106    return s, line_number
107
108
109def save_xyz(session, path, structures=None):
110    """Write an XYZ file from given models, or all models if None.
111    """
112    # Open path with proper encoding; 'open_output' automatically
113    # handles compression if the file name also has a compression
114    # suffix (e.g. .gz)
115    from chimerax.io import open_output
116    f = open_output(path, session.data_formats['XYZ'].encoding)
117
118    # If no models were given, use all atomic structures
119    if structures is None:
120        from chimerax.atomic import AtomicStructure
121        structures = session.models.list(type=AtomicStructure)
122    num_atoms = 0
123
124    # Loop through structures and print atoms
125    for s in structures:
126        # We get the list of atoms and transformed atomic coordinates
127        # as arrays so that we can limit the number of accesses to
128        # molecular data, which is slower than accessing arrays directly
129        atoms = s.atoms
130        coords = atoms.scene_coords
131
132        # First line for a structure is the number of atoms
133        print(str(s.num_atoms), file=f)
134        # Second line is a comment
135        print(getattr(s, "name", "unnamed"), file=f)
136        # One line per atom thereafter
137        for i in range(len(atoms)):
138            a = atoms[i]
139            c = coords[i]
140            print("%s %.3f %.3f %.3f" % (a.element, c[0], c[1], c[2]), file=f)
141        num_atoms += s.num_atoms
142    f.close()
143
144    # Notify user that file was saved
145    session.logger.status("Saved XYZ file containing %d structures (%d atoms)"
146                          % (len(structures), num_atoms))
147    # No return value

The open_xyz and _read_block functions are described in detail in Bundle Example: Read a New File Format.

The save_xyz function performs the following steps:

  • open the output file for writing using the ChimeraX function open_output() (lines 112-116),

  • if the structures keyword was not given, include all atomic structures for saving (lines 118-121),

  • initialize the total atom count (line 122),

  • loop through the structures to save (line 125) and:

    • get the lists of atoms and coordinates for the structure (lines 129-130),

    • print the first two lines (number of atoms and comment) for the structure to the file (lines 132-135),

    • print one line per atom using the atom and coordinates lists (lines 136-140), and

    • update the total atom count (line 141),

  • close the output file (line 142), and

  • finally, log a status message to let the user know what was written (lines 144-146).
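The file layout produced by these steps can be tried without ChimeraX. The sketch below follows the same per-structure sequence (atom count, comment line, one line per atom) but operates on plain Python data; the (name, atoms) pairs and the function name write_xyz_blocks are assumptions made for this example, not ChimeraX data structures or API.

```python
import io


def write_xyz_blocks(stream, structures):
    """Write XYZ blocks to 'stream' and return the total atom count.

    'structures' is a list of (name, atoms) pairs, and each atom is an
    (element, x, y, z) tuple.  This plain-data layout is an assumption
    made for the sketch, not a ChimeraX data structure.
    """
    num_atoms = 0
    for name, atoms in structures:
        # First line of a block is the number of atoms
        print(len(atoms), file=stream)
        # Second line is a comment; the structure name is used here
        print(name, file=stream)
        # One line per atom thereafter
        for element, x, y, z in atoms:
            print("%s %.3f %.3f %.3f" % (element, x, y, z), file=stream)
        num_atoms += len(atoms)
    return num_atoms


# Illustrative coordinates only.
water = ("water", [("O", 0.0, 0.0, 0.1173),
                   ("H", 0.0, 0.7572, -0.4692),
                   ("H", 0.0, -0.7572, -0.4692)])

buffer = io.StringIO()
total = write_xyz_blocks(buffer, [water])
xyz_text = buffer.getvalue()
print(xyz_text, end="")
```

Writing to any file-like object, rather than opening the path inside the function, keeps the formatting logic easy to test in isolation.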

Building and Testing Bundles

To build a bundle, start ChimeraX and execute the command:

devel build PATH_TO_SOURCE_CODE_FOLDER

Python source code and other resource files are copied into a build sub-folder below the source code folder. C/C++ source files, if any, are compiled and also copied into the build folder. The files in build are then assembled into a Python wheel in the dist sub-folder. The file with the .whl extension in the dist folder is the ChimeraX bundle.
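Since a Python wheel is a zip archive, its contents can be listed with the standard zipfile module to confirm that the expected files were packaged. The following is a minimal sketch: the helper name wheel_contents is hypothetical, and the demonstration builds a tiny stand-in archive (with an invented file name and invented member paths) so that it runs anywhere; a real bundle would be the .whl file in the dist sub-folder.

```python
import os
import tempfile
import zipfile


def wheel_contents(wheel_path):
    """Return the sorted list of file names stored in a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


# Build a tiny stand-in archive so the sketch runs anywhere.  The file
# name and member paths below are hypothetical, not actual build output.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.whl")
    with zipfile.ZipFile(demo_path, "w") as wheel:
        wheel.writestr("chimerax/tut_save/__init__.py", "")
        wheel.writestr("chimerax/tut_save/io.py", "")
    names = wheel_contents(demo_path)

print(names)
```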

To test the bundle, execute the ChimeraX command:

devel install PATH_TO_SOURCE_CODE_FOLDER

This will build the bundle, if necessary, and install the bundle in ChimeraX. Bundle functionality should be available immediately.

To remove temporary files created while building the bundle, execute the ChimeraX command:

devel clean PATH_TO_SOURCE_CODE_FOLDER

Some files, such as the bundle itself, may still remain and need to be removed manually.

Building bundles as part of a batch process is straightforward, as these ChimeraX commands may be invoked directly by using commands such as:

ChimeraX --nogui --exit --cmd 'devel install PATH_TO_SOURCE_CODE_FOLDER exit true'

This example executes the devel install command without displaying a graphics window (--nogui) and exits immediately after installation (exit true). The initial --exit flag guarantees that ChimeraX will exit even if installation fails for some reason.
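A batch script written in Python might assemble the same command line as an argument list and pass it to subprocess. This is a sketch under stated assumptions: the executable is assumed to be reachable as ChimeraX on the search path, the source folder name is assumed to contain no whitespace (so no quoting is attempted), and the function names are hypothetical.

```python
import subprocess


def devel_install_argv(source_folder, executable="ChimeraX"):
    """Build the argument list for a headless 'devel install' run.

    Assumes 'source_folder' contains no whitespace; no quoting is done.
    """
    command = "devel install %s exit true" % source_folder
    return [executable, "--nogui", "--exit", "--cmd", command]


def install_bundle(source_folder, executable="ChimeraX"):
    """Run the install and return the exit status of the process."""
    argv = devel_install_argv(source_folder, executable)
    return subprocess.run(argv).returncode


# Only the argument list is built here; call install_bundle() to run it.
argv = devel_install_argv("tut_save")
print(argv)
```

Passing a list to subprocess.run avoids a layer of shell quoting, because the entire ChimeraX command string travels as the single argument following --cmd.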

Distributing Bundles

Because ChimeraX bundles are packaged as standard Python wheel-format files, they can be distributed as plain files and installed using the ChimeraX toolshed install command. Thus, electronic mail, web sites and file sharing services can all be used to distribute ChimeraX bundles.

Private distributions are most useful during bundle development, when circulation may be limited to testers. When bundles are ready for public release, they can be published on the ChimeraX Toolshed, which is designed to help developers by eliminating the need for custom distribution channels, and to aid users by providing a central repository where bundles with a variety of different functionality may be found.

Customizable information for each bundle on the toolshed includes its description, screen captures, authors, citation instructions and license terms. Automatically maintained information includes release history and download statistics.

To submit a bundle for publication on the toolshed, you must first sign in. Currently, only Google sign-in is supported. Once signed in, use the Submit a Bundle link at the top of the page to initiate submission, and follow the instructions. The first time a bundle is submitted to the toolshed, it is held for inspection by the ChimeraX team, which may contact the authors for more information. Once approved, all subsequent submissions of new versions of the bundle are posted immediately on the site.

What’s Next