Andrej Sali¹, Tom Goddard², Greg Couch², and Conrad Huang²
¹ California Institute for Quantitative Biomedical Research (QB3), University of California, San Francisco
² Resource for Biocomputing, Visualization, and Informatics, University of California, San Francisco
Background
The Sali group at UCSF (http://salilab.org) uses computation grounded in the laws of physics and evolution to study the structure and function of proteins. We aim to improve and apply methods for: (i) predicting the structures of proteins; (ii) determining the structures of macromolecular assemblies; and (iii) annotating the functions of proteins using their structures. Most of our methods are implemented in our software package MODELLER (http://salilab.org/modeller). This research contributes to structure-based functional annotation of proteins and thus enhances the impact of genome sequencing, structural genomics, and functional genomics on biology and medicine.
CHIMERA is our primary molecular visualization package. Its visualization of sequences, structures, and alignments of individual proteins and their assemblies is a key tool in the development of our methods as well as in the analysis and presentation of our results. CHIMERA's capabilities are especially helpful for complex manipulations that involve additional information, such as dynamic trajectories and mass density maps from cryo-electron microscopy. More recently, we began to collaborate with Tom Ferrin and Wah Chiu (Baylor College of Medicine) to integrate three programs: the EMAN package for processing cryo-electron microscopy data, CHIMERA for visualization, and MODELLER for building models of sequences restrained by related known structures and electron microscopy maps. We expect the integrated system to increase both the productivity of electron microscopists and the quality of their results. We further illustrate our interactions with the RBVI with more detailed descriptions of two specific collaborations.
CHIMERA for Displaying the Contents of MODBASE
MODBASE is a comprehensive database of annotated comparative protein structure models for all known protein sequences that are detectably related to at least one known protein structure [1,2]. The database is freely accessible to the academic community through a web interface (http://salilab.org/modbase). MODBASE contains ~3 million models for domains of 1.3 million unique protein sequences, in addition to the corresponding fold assignments, sequence-structure alignments, model assessments, information about putative ligand binding sites, and single point mutations. MODBASE is bidirectionally linked with a variety of major biological databases, including UniProt at EBI and the Human Genome Browser at UCSC. We collaborated with the RBVI to allow users of MODBASE to visualize the alignments and annotated models directly from the MODBASE interface [1]. To achieve this goal, we created an extension to CHIMERA. The data contained in a MODBASE entry are divided among three different files: a template structure file, a model file, and an alignment file. Manually downloading these files and opening them with visualization tools can be cumbersome. The new CHIMERA extension enables a web browser to communicate directly with CHIMERA. Clicking on a single link associated with a MODBASE model starts CHIMERA on the user's local computer. Information related to the model is transmitted to CHIMERA via a registered MIME (Multipurpose Internet Mail Extensions) file type. CHIMERA then displays the structures of the template and the model, and shows their alignment in its multiple sequence alignment viewer, MultAlign Viewer. The user can then apply CHIMERA's rich set of visualization and analysis tools to further study the model. Additionally, for models with associated point mutations and putative ligand binding sites, the relevant residues are automatically highlighted in CHIMERA. In the future, MODBASE will contain models based on multiple templates.
We plan to adapt the MODBASE-CHIMERA interface to display the resulting complex multiple alignments in a more user-friendly way.
CHIMERA as a Graphical Interface in the Modeling Process
CHIMERA has recently also been used as a graphical interface for the modeling of loops in comparative modeling projects, as well as for restrained flexible fitting of comparative models into cryo-electron microscopy maps. There are currently many more known protein sequences than known protein structures, so the determination of protein structure from sequence by comparative (homology) modeling is of great interest. The MODELLER package is commonly used for this purpose (~11,000 different users have downloaded the package so far) [3]. Two particular areas of current interest are the refinement of protein loops using statistical potentials [4] and the use of additional sources of data, such as cryo-electron microscopy experiments [5-7].
While MODELLER is a powerful package, it can be hard to use effectively without visual feedback, for instance to see the fit between a density map and a protein model, or to see the configuration of a set of loops. An extension to CHIMERA was created to simplify the use of MODELLER for loop modeling. Given a starting protein structure, the user graphically selects a set of residues for loop refinement. The inputs for MODELLER are then synthesized and a number of candidate loop models are built. These models can then be viewed and ranked within CHIMERA.
In the future, we plan to link MODELLER and CHIMERA more closely to make the loop modeling process itself more interactive, so that the user can terminate obviously bad loops, guide the optimization by manually perturbing the structure, or adjust the parameters. The same approach will also be applicable to the EM density fitting tools in MODELLER (Mod-EM), where it can streamline the procedure by which density maps are used to improve the accuracy of comparative models. This project meshes well with the visualization tools that the RBVI has been developing for the cryoEM community.
References:
[1] U. Pieper, N. Eswar, H. Braberg, M.S. Madhusudhan, F.P. Davis, A.C. Stuart, N. Mirkovic, A. Rossi, M.A. Marti-Renom, A. Fiser, B. Webb, D. Greenblatt, C.C. Huang, T.E. Ferrin, and A. Sali. "MODBASE, A Database of Annotated Comparative Protein Structure Models, and Associated Resources," Nuc. Acids Res. 32, D217-D222, 2004.
[2] U. Pieper, N. Eswar, F.P. Davis, H. Braberg, M.S. Madhusudhan, A. Rossi, M. Marti-Renom, R. Karchin, B.M. Webb, D. Eramian, M.Y. Shen, L. Kelly, F. Melo, and A. Sali. "MODBASE, A Database of Annotated Comparative Protein Structure Models and Associated Resources," Nuc. Acids Res. 34, D291-D295, 2006.
[3] A. Sali and T.L. Blundell. "Comparative Protein Modelling by Satisfaction of Spatial Restraints," J. Mol. Biol. 234, 779-815, 1993.
[4] A. Fiser, R.K. Do, and A. Sali. "Modeling of Loops in Protein Structures," Protein Sci. 9, 1753-1773, 2000.
[5] M. Topf, M.L. Baker, B. John, W. Chiu, and A. Sali. "Structural Characterization of Components of Protein Assemblies by Comparative Modeling and Electron Cryo-Microscopy," J. Struct. Biol. 149, 191-203, 2005.
[6] M. Topf and A. Sali. "Combining Electron Microscopy and Comparative Protein Structure Modeling," Current Opinion in Structural Biology 15, 578-585, 2005.
[7] M. Topf, M.L. Baker, M.A. Marti-Renom, W. Chiu, and A. Sali. "Refinement of Protein Structures by Iterative Comparative Modeling and CryoEM Density Fitting," J. Mol. Biol. 357, 1655-1668, 2006.