chemViz: Cheminformatics Plugin for Cytoscape
Figure 1. chemViz in action. This example shows a portion of a network of predicted hits for a chemical assay. A 2D Structure Table has been generated for the selected nodes, and the numbers of hydrogen bond acceptors and donors for the compounds have been calculated and added to the table. Larger images of two of the structures are shown. The calculated hydrogen bond acceptors and donors were mapped to Cytoscape attributes by chemViz and used to set the node color and node border color in the network.
UCSF chemViz is a Cytoscape plugin that extends the capabilities of Cytoscape into the domain of cheminformatics. chemViz displays 2D diagrams of compounds specified by InCHI or SMILES strings. chemViz can also calculate Tanimoto similarities of compounds and use the values to create chemical similarity networks. Part of such a network is shown above. The 2D diagrams can be presented as scalable independent windows or as part of a table also showing Cytoscape attributes and calculated compound descriptors, including number of hydrogen bond donors, number of hydrogen bond acceptors, molecular weight, ALogP, molecular refractivity, number of Rule of Five violations, and several more. Any of the calculated descriptors can be mapped onto Cytoscape attributes, where they can be used by the VizMapper and saved with the session. In the network above, nodes are colored by the number of hydrogen bond acceptors and node borders are colored by the number of hydrogen bond donors. chemViz depends on version 2.6.1 of Cytoscape and is available from the Cytoscape plugin manager or at http://www.rbvi.ucsf.edu/cytoscape/chemViz.
Instructions
Installation
chemViz is available through the Cytoscape plugin manager or by downloading the source directly from the Cytoscape svn repository (see Cytoscape Subversion Server information, or browse the csplugins/ucsf/scooter/chemViz sources). To download chemViz using the plugin manager, you must be running Cytoscape 2.6.1 or newer. chemViz is available in the Analysis group of plugins. To install it, bring up the Manage Plugins dialog (Plugins→Manage Plugins) and select Analysis under Available for Install. Select chemViz and click the Install button.
Figure 2. The chemViz Settings Dialog. This dialog allows users to customize the cutoffs and attribute settings used by chemViz.
Settings
The first step in using chemViz is to adjust the settings to correspond to your network attributes. By default chemViz will look for SMILES strings in the Cytoscape attributes: SMILES, Smiles, smiles, Compounds, or Compound. InCHI strings will be searched for in the attributes: InCHI, inchi, InChi, or InChI. These attributes may contain Cytoscape lists or comma-separated values. Either of these settings can be overridden through the Settings... dialog (see Figure 2). The Settings... dialog can also be used to change the default cutoffs for creating similarity edges and restricting the number of compounds to show in a single 2D popup. Each of the settings is discussed briefly below.
- Maximum number of compounds to show in 2D structure popup
- chemViz has two ways of displaying the 2D structures corresponding to SMILES or InCHI strings. For multiple nodes or edges or for nodes and edges with large numbers of compounds, the easiest way to view the compounds is with a table that includes not only a 2D representation of the compound, but also information about the node or edge associated with the compound or calculated chemical descriptors such as the molecular weight. The other way to display compound structures is as a small popup with just the selected structures displayed. If the number of structures is large, this popup can be very slow and the structures so small as to be unusable. The value in this field is used to limit the number of 2D structures included in a popup.
- Minimum tanimoto value to consider for edge creation
- When using chemViz to create a new network or new edges based on the similarity between two compounds, it is customary to set a minimum Tanimoto value below which no edge is created, since drawing an edge between two dissimilar compounds may not be useful for either analysis or visualization.
- Attributes that contain SMILES strings
- Select the list of attributes that chemViz will use to search for SMILES strings. Node or edge attributes can be selected from the list. This is a multiple-selection dialog, so multiple attributes can be selected by holding down the Control key.
- Attributes that contain InCHI fingerprints
- Select the list of attributes that chemViz will use to search for InCHI strings. Node or edge attributes can be selected from the list. This is a multiple-selection dialog, so multiple attributes can be selected by holding down the Control key.
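The default SMILES attribute lookup described above can be illustrated with a minimal Python sketch. This is not chemViz code: the node_attributes dictionary is a hypothetical stand-in for a node's Cytoscape attributes, and collect_smiles is a name introduced here for illustration. Only the SMILES case is shown.

```python
# Default attribute names chemViz searches for SMILES strings (taken from the text above).
SMILES_ATTRIBUTES = ["SMILES", "Smiles", "smiles", "Compounds", "Compound"]


def collect_smiles(node_attributes, names=SMILES_ATTRIBUTES):
    """Return all SMILES strings found in a node's attributes.

    node_attributes is a hypothetical dict standing in for Cytoscape
    attributes; each value may be a list or a comma-separated string.
    """
    found = []
    for name in names:
        value = node_attributes.get(name)
        if value is None:
            continue
        # Accept either a list of strings or one comma-separated string.
        items = value if isinstance(value, list) else value.split(",")
        found.extend(s.strip() for s in items if s.strip())
    return found


node = {"SMILES": "CCO, c1ccccc1", "Compound": ["CC(=O)O"]}
print(collect_smiles(node))  # ['CCO', 'c1ccccc1', 'CC(=O)O']
```

Passing a different names list mimics overriding the attribute selection in the Settings... dialog.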
Figure 3. The 2D Structures Popup showing six structures from a node in a Cytoscape network. By resizing the popup frame, users can scale the structural representations.
Showing 2D Structures
As mentioned above, there are two ways to show the 2D representation of a chemical compound using chemViz: the 2D structures popup and a 2D structure table. Visualizing the 2D structures using a popup is only available from the node or edge context menu. Each of these approaches is discussed below.
2D Structures Popup
The 2D structures popup may be displayed for any node or edge with either SMILES or InCHI attributes using the edge or node context menu: Cheminformatics Tools→Depict 2D Structure→Show 2D structure(s) from this node(or edge). This will bring up a dialog with 2D representations for all of the compounds described by the SMILES or InCHI strings associated with that node or edge. The popup is resizable and the 2D structure representations will scale to match the size of the popup. Figure 3 shows the result of requesting the 2D structures popup for a node with 6 structures annotated.
In addition to using the context menu, the 2D structure popup is available by double-clicking on a 2D structure in the 2D structure table (see below).
Figure 4. The 2D Structure Table showing six structures from a node in a Cytoscape network. By resizing the popup frame, users can scale the structural representations.
2D Structure Table
The most flexible way to display 2D structures and corresponding attributes and descriptors is through the chemViz 2D Structure Table. This dialog displays a table which can include Cytoscape attributes, molecular descriptors, and the 2D depiction of a compound. A 2D Structure Table may be displayed for a single node or edge, a group of nodes or edges, or all of the nodes or edges in the network. The 2D Structure Table may be displayed for a single node (or edge) or the currently selected set of nodes or edges using the node or edge context menu: Cheminformatics Tools→Depict 2D Structure→Show table of compounds from selected nodes(or edges). It can also be displayed using the main plugin menu: Plugins→Cheminformatics Tools→Depict 2D Structure→Show table of compounds from selected nodes(or edges) or Plugins→Cheminformatics Tools→Depict 2D Structure→Show table of compounds from all nodes(or edges). Using any of these menus will bring up a table with default columns: ID - the ID of the node or edge, Attribute - the Cytoscape attribute used to retrieve the SMILES or InCHI string, Molecular String - the SMILES or InCHI string, Molecular Wt. - the molecular weight of the compound, and 2D Image - the 2D depiction of the compound. As with the 2D structures popup discussed above, the table may be resized, as can the individual columns in the table. Columns may be reordered by dragging the column headers, and clicking on a column header sorts the table based on the values in that column (clicking again reverses the sort order, and a third click removes the sort). Double-clicking on a single 2D image opens a 2D structure popup with only that structure.
A 2D Structure Table may be customized further by right-clicking on any of the column headers. This will bring up a context menu for that column which allows users to remove the column from the table (Remove Column) or to add a new column using data from corresponding Cytoscape attributes (Add New Column→Cytoscape attributes→) or calculated molecular descriptors (Add New Column→Molecular descriptors→). This capability allows molecular descriptors, Cytoscape attributes, and 2D depictions of the structures to be displayed in a table, sorted, and compared. Selecting any row in the table will select the corresponding node or edge. Similarly, selecting any node or edge that is represented in the table will select the corresponding rows in the table.
At the bottom of the 2D Structure Table are three buttons:
- Close:
- Closes the table, although the compound information will remain cached to speed further access
- Export Table...:
- Exports the contents of the table to a comma-separated text file. At this point, the 2D Image column cannot be exported.
- Print Table...:
- Provides the capability of printing the contents of the table (including the 2D Image column)
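A file written by Export Table... can be processed outside Cytoscape. The sketch below is illustrative only: it assumes the exported file starts with a header row using the default column names listed above (ID, Attribute, Molecular String, Molecular Wt.), which the documentation does not confirm, and the sample rows are made up for the example. load_table is a hypothetical helper name.

```python
import csv
import io


def load_table(csv_text):
    """Parse exported table text and sort the rows by molecular weight.

    Assumes a header row with a 'Molecular Wt.' column (an assumption,
    not a documented chemViz export format).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: float(row["Molecular Wt."]))
    return rows


# Made-up sample standing in for the contents of an exported file.
sample = (
    "ID,Attribute,Molecular String,Molecular Wt.\n"
    "node2,SMILES,c1ccccc1,78.11\n"
    "node1,SMILES,CCO,46.07\n"
)

for row in load_table(sample):
    print(row["ID"], row["Molecular String"], row["Molecular Wt."])
```

For a real export, the text would be read from the saved file instead of the sample string.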
Calculating Molecular Descriptors
chemViz uses the open-source Chemistry Development Kit (CDK) for 2D depictions and calculating molecular descriptors for the compounds. By default, CDK uses 1024 bit standard hashed fingerprints that ignore cyclic systems, and at this point, chemViz just uses the default fingerprinting mechanism. Other fingerprints are possible with CDK, but the default fingerprints have been shown to be adequate for most purposes. CDK provides a large number of molecular descriptors, some of which can be calculated directly from the SMILES/InCHI (and resulting fingerprints) and some of which require conversion of the compound into a three-dimensional structure. This conversion can be computationally expensive and error-prone if the appropriate templates are not available. For that reason, chemViz will only calculate the molecular descriptors described below:
- Exact Mass
- The total exact mass of the molecule, assuming the "standard" isotope for each element.
- ALogP
- The 1-octanol/water partition coefficient, logP (calculated following the Ghose and Crippen (1986) LOGKow algorithm)
- ALogP2
- This is the square of the ALogP value, i.e. ALogP².
- Molar refractivity
- The molar refractivity of the compound following the Ghose and Crippen (1987) method
- HBond Acceptors
- The number of possible hydrogen bond acceptors in this compound
- HBond Donors
- The number of possible hydrogen bond donors in this compound
- Rotatable Bonds Count
- The number of rotatable bonds in this compound
- Rule of Five Failures
- The number of Lipinski "Rule of Five" failures calculated for the structure.
- Topological Polar Surface Area
- The 2D estimated topological polar surface area based on fragment contributions (TPSA).
- Wiener Path
- The Wiener path number: half the sum of all atom distances in the structure.
- Wiener Polarity
- The number of 3 bond length distances in the molecule
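To make three of these definitions concrete, here is a minimal Python sketch. It is not the CDK implementation: the function names are hypothetical, the molecule is given as a hand-written adjacency list of heavy atoms (an assumption made for the example), and the Rule of Five check uses the four classic Lipinski criteria, which may differ from the checks CDK applies.

```python
from collections import deque


def rule_of_five_failures(mol_weight, logp, hbond_donors, hbond_acceptors):
    """Count violations of the four classic Lipinski criteria."""
    checks = [mol_weight > 500, logp > 5, hbond_donors > 5, hbond_acceptors > 10]
    return sum(checks)


def bond_distances(adjacency):
    """Yield the shortest bond-count distance for each unordered atom pair."""
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from the start atom
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if neighbor not in dist:
                    dist[neighbor] = dist[atom] + 1
                    queue.append(neighbor)
        for atom, d in dist.items():
            if atom > start:  # report each pair only once
                yield d


def wiener_numbers(adjacency):
    """Return (Wiener path number, Wiener polarity number)."""
    distances = list(bond_distances(adjacency))
    return sum(distances), sum(1 for d in distances if d == 3)


# Carbon skeleton of n-butane: a chain of four atoms, indices 0 to 3.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(wiener_numbers(butane))  # (10, 1)
print(rule_of_five_failures(620.0, 5.6, 2, 7))  # 2
```

Summing each unordered pair once is equivalent to taking half the sum over the full distance matrix, which matches the Wiener path definition above.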
As mentioned above, chemViz can be used to add values for molecular descriptors to a 2D Structure Table by using the Add New Column→Molecular descriptors→ context menu that is available on the column headers. In addition, the node or edge context menus and the Plugins→Cheminformatics Tools menu contain a Create attributes from molecular descriptors menu.
Calculating Molecular Similarity
A common task for cheminformatics tools is to calculate the similarity of two compounds. The usual mechanism for doing this is to calculate the Tanimoto coefficient between the two compounds, which is a measure of their similarity based on the angle between the attribute vectors (fingerprints) of each compound. Thus this measure depends on the specific fingerprint descriptor used. Common descriptors are MACCS, PubChem, and Daylight. The CDK uses a 1024 bit hashed fingerprint, which ignores cyclic systems.
Both the node and edge context menus and the Plugins→Cheminformatics Tools menu contain a Calculate Similarity→Tanimoto Coefficients submenu. If no nodes are selected, the Tanimoto coefficients for all nodes are calculated and a new network is generated with an edge between all node pairs where the Tanimoto coefficient is larger than the Minimum tanimoto value to consider for edge creation setting from the Settings Dialog. If more than one node is selected, the Tanimoto Coefficients menu becomes a submenu with two options: All Nodes and Selected nodes only. The All Nodes option creates a new network as described above. If the Selected nodes only option is chosen, the Tanimoto coefficients for all of the selected nodes that are connected by edges are calculated and a new TanimotoSimilarity edge attribute is created which contains the new values.
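As an illustration of similarity-based edge creation, the sketch below is a simplified stand-in, not chemViz or CDK code. For bit fingerprints, the Tanimoto coefficient is commonly computed as the number of bits set in both fingerprints divided by the number of bits set in either. Here each fingerprint is represented as a Python set of "on" bit positions, and the function names, node names, and bit values are all hypothetical.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_edges(fingerprints, min_tanimoto):
    """Return (node_a, node_b, score) for every pair scoring above the cutoff."""
    edges = []
    for a, b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[a], fingerprints[b])
        if score > min_tanimoto:  # "larger than" the minimum, as described above
            edges.append((a, b, score))
    return edges


# Made-up fingerprints: A and B share three of five distinct bits, C shares none.
fingerprints = {
    "A": {1, 2, 3, 4},
    "B": {2, 3, 4, 5},
    "C": {7, 8},
}
print(similarity_edges(fingerprints, 0.5))  # [('A', 'B', 0.6)]
```

Raising the cutoff removes weaker edges: with a minimum of 0.6 or higher, this example produces no edges at all.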
