pdb: Read and write PDB files
Directly calling the routines below would typically be about your third choice for reading/writing PDB files. The first choice would be to execute the equivalent ChimeraX command directly with code like:
from chimerax.core.commands import run
opened_models = run(session, "open /path/to/file.pdb")
from chimerax.core.commands import run
run(session, "save /path/to/file.pdb #1")
This approach has the advantage that you do not need to find the equivalent API call for every command you want to execute, or figure out what arguments those calls need. This approach is discussed in the Pro Tip section near the top of the Developer Tutorial.
The second approach would be to use the open- or save-command managers. Those managers know which bundles provide support for opening/saving various file formats, and provide a generic interface for opening/saving files, e.g.:
models, status_message = session.open_command.open_data("/path/to/file.pdb", [other Python format-specific keywords])
session.save_command.save_data("/path/to/file.pdb", models=[model1], [other Python format-specific keywords])
This second approach has the disadvantage that the values for the keywords may not be obvious in some cases (i.e. you would have to look at the underlying API). Also, the models returned by open_data() have not been added to the session. Details like this are discussed in the Python Functions implementing User Commands documentation, under open and save.
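Because open_data() leaves the models out of the session, a caller usually has to add them explicitly. The following is a minimal sketch of that step, not part of the ChimeraX API: the helper name open_pdb_into_session is hypothetical, and it assumes the session's model manager exposes session.models.add(), which is the usual way to add models in ChimeraX.

```python
def open_pdb_into_session(session, path, **format_kw):
    """Open a file via the open-command manager and add the models to the session.

    Hypothetical helper for illustration; format_kw stands for the
    format-specific Python keywords mentioned above.
    """
    # open_data() returns (models, status_message); the models are NOT yet in the session
    models, status_message = session.open_command.open_data(path, **format_kw)
    # Assumption: session.models.add() is how models are added to the session
    session.models.add(models)
    return models, status_message
```

Inside ChimeraX this would be called as open_pdb_into_session(session, "/path/to/file.pdb").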
And finally, the third approach would be to call the PDB reading/writing API directly. Something like:
from chimerax.pdb import open_pdb
models, status_message = open_pdb(session, "/path/to/file.pdb")
from chimerax.pdb import save_pdb
save_pdb(session, "/path/to/file.pdb", models=[model1])
The only advantage of this third approach is in the rare case where you need to use an esoteric Python-only keyword argument that isn’t supported in the equivalent open/save command. For instance, save_pdb() has a polymeric_res_names argument for when you need to output residues in ATOM records that would otherwise be output as HETATM records (this capability is used by the modeller bundle).
pdb: PDB format support
Read Protein DataBank (PDB) files.
- open_pdb(session, stream, file_name=None, *, auto_style=True, coordsets=False, atomic=True, max_models=None, log_info=True, combine_sym_atoms=True, segid_chains=False, slider=True, missing_coordsets='renumber')
Experimental API. Read PDB data from a file or stream and return a list of models and status information.
stream is either a string with a file system path to a PDB file, or an open input stream to PDB data.
file_name is the name to give to the resulting model(s). Typically only needed if the input is an anonymous stream or the input file name wouldn’t be a good model name.
coordsets controls whether a multi-MODEL PDB is opened as a list of structures or as a single structure with multiple coordinate sets.
atomic controls whether AtomicStructure or Structure is used as the class for the structure. The latter should be used for PDB files that don’t actually contain atomic data per se, like SAXS “PDB” files or coarse-grain models.
max_models limits the number of models this routine can return.
combine_sym_atoms controls whether otherwise identical atoms with no bonds that are also very close together in space should be combined into a single atom.
segid_chains controls whether the chain ID should come from the normal chain ID columns or from the “segment ID” columns.
slider controls whether a slider tool is shown when a multi-model PDB file is opened as a trajectory.
missing_coordsets is for the rare case where MODELs are being collated into a trajectory and the MODEL numbers are not consecutive. The possible values are ‘fill’ (fill in the missing coordinate sets with copies of the preceding coord set), ‘ignore’ (don’t fill in; use MODEL number as is for coordset ID), and ‘renumber’ (don’t fill in and use the next available coordset ID).
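The three missing_coordsets policies are easiest to compare on a small example. The sketch below is a standalone illustration and not ChimeraX’s implementation: the function name assign_coordset_ids is hypothetical, it assumes the MODEL numbers arrive in ascending order, and it assumes the first coordset keeps its MODEL number as its ID.

```python
def assign_coordset_ids(model_numbers, missing_coordsets="renumber"):
    """Return (coordset_id, source_model_number) pairs for the given MODEL numbers.

    Illustrative only; assumes model_numbers is in ascending order.
    """
    if missing_coordsets not in ("fill", "ignore", "renumber"):
        raise ValueError("unknown missing_coordsets value: %r" % (missing_coordsets,))
    result = []
    prev_id = prev_model = None
    for num in model_numbers:
        if prev_id is None or missing_coordsets == "ignore":
            # Use the MODEL number as is
            cs_id = num
        elif missing_coordsets == "renumber":
            # Use the next available coordset ID, closing any gap
            cs_id = prev_id + 1
        else:
            # 'fill': copy the preceding coordset into each missing ID
            for gap_id in range(prev_id + 1, num):
                result.append((gap_id, prev_model))
            cs_id = num
        result.append((cs_id, num))
        prev_id, prev_model = cs_id, num
    return result


for policy in ("fill", "ignore", "renumber"):
    print(policy, assign_coordset_ids([1, 2, 5], policy))
```

For MODEL numbers 1, 2, 5, ‘fill’ yields coordset IDs 1 through 5 (with 3 and 4 copied from MODEL 2), ‘ignore’ yields IDs 1, 2, 5, and ‘renumber’ yields IDs 1, 2, 3.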
- save_pdb(session, output, *, models=None, selected_only=False, displayed_only=False, all_coordsets=False, pqr=False, rel_model=None, serial_numbering='h36', polymeric_res_names=None)
Experimental API. Write PDB data to a file.
output is a file system path to a writable location. It can contain the strings “[NAME]” and/or “[ID]”, which will be replaced with the model’s name/id, respectively.
models is a list of models to output. If not specified, all structure models will be output.
selected_only controls whether only currently selected atoms should be output.
displayed_only controls whether only currently displayed atoms should be output.
all_coordsets controls whether or not, for a multi-coordset model, all coordsets should be written out (using MODEL records) or just the current coordinate set.
pqr controls whether ATOM/HETATM records will be written out in non-standard PQR format or not.
rel_model, if given, is a model that the output coordinates should be written “relative to”, i.e. whatever the inverse of rel_model’s current transformation is, apply that to the atomic coordinates before outputting them.
serial_numbering controls how serial numbers are output when they would exceed PDB column limits. “h36” means to use Hybrid-36 numbering. “amber” means to steal a column from the “ATOM ” record name and to leave the serial numbers in other types of records (e.g. CONECT) uncorrected.
polymeric_res_names is a list of residue names that should be considered “standard” as far as the output of ATOM vs. HETATM goes. If not specified, the residue names that the RCSB considers standard will be used.
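For readers unfamiliar with the Hybrid-36 numbering mentioned under serial_numbering, the sketch below shows how the publicly documented scheme fits large serial numbers into a fixed-width field. It is a standalone illustration, not the code ChimeraX uses: the function name hy36_encode is chosen here for the example, and negative values are deliberately not handled.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def _encode_base36(value, digits):
    """Encode a non-negative integer in base 36 using the given digit set."""
    if value == 0:
        return digits[0]
    chars = []
    while value:
        value, rem = divmod(value, 36)
        chars.append(digits[rem])
    return "".join(reversed(chars))


def hy36_encode(width, value):
    """Encode value into a field of the given width using Hybrid-36.

    Illustrative sketch; only non-negative values are supported here.
    """
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    decimal_limit = 10 ** width
    if value < decimal_limit:
        # Ordinary right-justified decimal while the number still fits
        return str(value).rjust(width)
    block = 26 * 36 ** (width - 1)   # how many values start with a letter
    offset = 10 * 36 ** (width - 1)  # shifts the first digit to 'A' (or 'a')
    value -= decimal_limit
    if value < block:
        return _encode_base36(value + offset, DIGITS_UPPER)
    value -= block
    if value < block:
        return _encode_base36(value + offset, DIGITS_LOWER)
    raise ValueError("value out of range for width %d" % width)


print(hy36_encode(5, 99999))   # last plain decimal serial number
print(hy36_encode(5, 100000))  # first Hybrid-36 serial number
```

With the five-column atom serial field, 99999 is written as “99999” and 100000 becomes “A0000”; once the upper-case range is exhausted, the encoding continues with lower-case letters.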