Copyright © 1996 by the Regents of the University of California.

PDBio Interface

The PDBio interface provides the base-level functionality for the PDBio class. The PDBio class supports reading and writing Brookhaven Protein Data Bank (PDB) files and must be linked with the PDB library. The PDBio interface depends on the Molecule, CoordSet, Coord, Residue, Atom, and Bond interfaces. It also uses the PDBio helper functions from the Molecule Toolkit support library and a residue template library.

Note: Multiple models in a PDB file are interpreted to mean that multiple coordinate sets are to be created. Multiple molecules in a single file must be separated by PDB END records.

Atom Member Functions

char altLoc() const
Get an Atom's alternate location identifier.
void setAltLoc(char al)
Set an Atom's alternate location identifier.

Residue Member Functions

Atom *findAtom(Symbol name, char altLoc) const
Version of findAtom that uses the alternate location for an exact match. If there are no alternate atoms, then the non-alternate atom is returned.
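The lookup rule described above might be sketched as follows, using stand-in types rather than the toolkit's own. SketchAtom, AtomTable, and findAtomSketch are hypothetical names. Two things are assumptions: a blank altLoc marks a non-alternate atom, and a null pointer is returned when alternates exist but none matches (the text above does not say what happens in that case).

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for the toolkit's Atom; only the fields this sketch needs.
struct SketchAtom {
    std::string name;
    char altLoc;        // assumption: ' ' means "no alternate location"
};

// Stand-in for a Residue's atom table: several atoms may share one name.
typedef std::multimap<std::string, SketchAtom *> AtomTable;

SketchAtom *findAtomSketch(const AtomTable &atoms, const std::string &name,
                           char altLoc)
{
    std::pair<AtomTable::const_iterator, AtomTable::const_iterator> range =
        atoms.equal_range(name);
    SketchAtom *plain = 0;
    bool hasAlternates = false;
    for (AtomTable::const_iterator i = range.first; i != range.second; ++i) {
        SketchAtom *a = i->second;
        if (a->altLoc == altLoc)
            return a;                   // exact alternate-location match
        if (a->altLoc == ' ')
            plain = a;                  // remember the non-alternate atom
        else
            hasAlternates = true;
    }
    // Fall back to the non-alternate atom only when no alternates exist.
    // Returning null otherwise is this sketch's assumption.
    return hasAlternates ? 0 : plain;
}
```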

PDBio Member Constants

ATOMS
Use this constant to set a bit in the whatToDo boolean vector indicating that PDB ATOM records are to be processed.
COVALENT_BONDS
Use this constant to set a bit in the whatToDo boolean vector indicating that covalent bonds from PDB CONECT records are to be processed.
HYDROGEN_BONDS
Use this constant to set a bit in the whatToDo boolean vector indicating that hydrogen bonds from PDB CONECT records are to be processed.
SALT_BRIDGES
Use this constant to set a bit in the whatToDo boolean vector indicating that salt bridges from PDB CONECT records are to be processed.
COMMENTS
Use this constant to set a bit in the whatToDo boolean vector indicating that PDB HEADER, SOURCE, COMPND, AUTHOR and JRNL records are to be processed.
ALL_CONECT
Use this constant to set a bit in the whatToDo boolean vector indicating that PDB CONECT records should be generated for all residues, not just the non-standard ones.
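As a rough illustration of how constants like these index a boolean vector, the sketch below uses a stand-in struct, SketchPDBio, with a hypothetical enumeration. The numeric values and the NUM_WHAT sentinel are assumptions, not the real PDBio values. defaultWhatToDo mirrors the documented default of enabling ATOMS and COVALENT_BONDS.

```cpp
#include <vector>

// Stand-in for PDBio; the actual constant values are not documented here.
struct SketchPDBio {
    enum {
        ATOMS, COVALENT_BONDS, HYDROGEN_BONDS,
        SALT_BRIDGES, COMMENTS, ALL_CONECT,
        NUM_WHAT        // hypothetical sentinel: number of bit indices
    };
};

// Build a vector with only the documented default bits set.
std::vector<bool> defaultWhatToDo()
{
    std::vector<bool> what(SketchPDBio::NUM_WHAT, false);
    what[SketchPDBio::ATOMS] = true;
    what[SketchPDBio::COVALENT_BONDS] = true;
    return what;
}
```

A caller would then enable another category the same way the Example section does, by assigning to one element, for instance what[SketchPDBio::SALT_BRIDGES] = true.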

PDBio Member Functions

PDBio()
PDBio(const vector<bool> &what)
PDBio(const vector<bool> &what, const vector<bool> &readMask, const vector<bool> &writeMask)
Constructors. The default constructor sets the ATOMS and COVALENT_BONDS whatToDo bits. The second constructor uses the given vector in place of those defaults. The third constructor also initializes the read and write masks for processing PDB records.
void readPDBfile(const char *filename)
Read the file (standard input if NULL) into one or more new Molecules.
int readPDBstream(istream &is, const char *filename, int firstLine)
Same as readPDBfile except from an istream. The filename and firstLine arguments are used for errors. The line number of the last line read is returned. A PDB END record or an end of file condition terminates this function.
void writePDBfile(const char *filename)
Write the Molecule(s) into the given file (standard output if NULL).
void writePDBstream(ostream &os, const char *filename)
Same as writePDBfile except that the output is written on the given ostream.
const bool ok()
Return true if last file operation was successful.
const char *error()
Return an explanation of a failed file operation.
Molecule *molecule() const
Return the first resulting Molecule from a readPDBfile().
const Molecules &molecules() const
Return the list of Molecules from a readPDBfile(). This function is generated by genlib.
void setMolecule(Molecule *mol)
Set the Molecule for a writePDBfile().
void setMolecules(const Molecules mols)
Set the list of Molecules for a writePDBfile().
vector<bool> &whatToDo()
Return the boolean vector that describes the parts of the PDB file you wish to be read in or written out. The Member Constants section above lists the various bit indices.
vector<bool> &readMask()
Return the boolean vector that describes the PDB records you wish to process via the readPDB() function. The bit indices are from the PDB::recordType enumerated type.
void setReadPDB(bool (*rf)(PDB *p, Molecule *m, const map<int, Atom *, less<int> > *serials))
Set the function that is called to process a given PDB record before readPDBfile() does. If the function returns true, then readPDBfile() will ignore the PDB record entirely. The arguments to the given function, rf, are the PDB record, the molecule being read in, and the current mapping of atom serial numbers to Atom pointers.
void setPostprocessAtom(void (*pf)(PDB *p, Molecule *m, Residue *r, Atom *a, const map<int, Atom *, less<int> > *serials))
Set the function that is called after an ATOM/HETATM PDB record has been read in, the corresponding Atom object has been created, and the serial number assigned. The arguments to the given function, pf, are the PDB record, the molecule being read in, the current residue, the current atom, and the current mapping of atom serial numbers to Atom pointers.
bool (*readPDB())(PDB *, Molecule *, const map<int, Atom *, less<int> > *)
Return the function set by the most recent call to setReadPDB().
vector<bool> &writeMask()
Return the boolean vector that describes the PDB records you wish to process via the writePDB() function. The bit indices are from the PDB::recordType enumerated type.
void setWritePDB(bool (*w)(ostream &, PDB *, Molecule *, Atom *, const map<Atom *, int, less<Atom *> > *))
Set the function that is called to manipulate a given PDB record before writePDBfile() outputs it. If the function returns true, then writePDBfile() will not write the PDB record. The arguments to the given function, w, are the output stream, the PDB record, the molecule being written, the current atom, and the current mapping of Atom pointers to atom serial numbers.
bool (*writePDB())(ostream &, PDB *, Molecule *, Atom *, const map<Atom *, int, less<Atom *> > *)
Return the function set by the most recent call to setWritePDB().
(And genlib generated functions: molecules(), findMolecule(), etc.)
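The "return true to skip" convention used by the setReadPDB() and setWritePDB() hooks can be sketched in isolation. Everything below is a simplified stand-in: SketchRecord, ReadHook, filterRecords, and skipRemarks are hypothetical names, and the hook takes only a record, where the real callbacks also receive the molecule and the serial-number map.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a PDB record.
struct SketchRecord {
    std::string type;   // record name, e.g. "ATOM" or "REMARK"
    std::string text;   // remainder of the record
};

// Hook type: return true to claim the record so the reader ignores it.
typedef bool (*ReadHook)(const SketchRecord &);

// Return the records that the reader itself would go on to process.
std::vector<SketchRecord> filterRecords(const std::vector<SketchRecord> &in,
                                        ReadHook hook)
{
    std::vector<SketchRecord> kept;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (hook != 0 && hook(in[i]))
            continue;               // hook returned true: skip this record
        kept.push_back(in[i]);
    }
    return kept;
}

// Example hook: claim every REMARK record.
bool skipRemarks(const SketchRecord &r)
{
    return r.type == "REMARK";
}
```

Passing a null hook keeps every record, which corresponds to the case where no callback has been set.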

PDBio Static Functions

void bondLengthTolerance(float t)
Set the bond length tolerance used to connect atoms in residues that have neither a template nor CONECT records. The default value is 0.22 angstroms.
float bondLengthTolerance()
Return the bond length tolerance.
bool standardResidue(const String &type)
bool standardResidue(Symbol type)
Return true if the argument is in the set of standard PDB residue names. Standard residues have ATOM records instead of HETATM records (water, H2O or WAT, is not standard in this sense). The standard residues are the three-letter names for the 20 amino acids and the five single-letter names for the nucleic acids.
void addStandardResidue(const String &name)
Add the given name to the set of standard PDB residue names.
void removeStandardResidue(const String &name)
Remove the given name from the set of standard PDB residue names.
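A name set with this behavior might look like the following sketch. SketchResidueNames is a hypothetical class, not the PDBio implementation. The initial contents are an assumption based on the description above: the 20 three-letter amino acid codes plus A, C, G, T, and U for the nucleic acids.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the standard-residue bookkeeping.
class SketchResidueNames {
    std::set<std::string> names_;
public:
    SketchResidueNames()
    {
        // Assumed initial set: 20 amino acids and five nucleic acids.
        static const char *const initial[] = {
            "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY",
            "HIS", "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER",
            "THR", "TRP", "TYR", "VAL",
            "A", "C", "G", "T", "U"
        };
        names_.insert(initial, initial + sizeof initial / sizeof initial[0]);
    }
    // True if the name is currently in the standard set.
    bool standard(const std::string &name) const
    {
        return names_.find(name) != names_.end();
    }
    void add(const std::string &name) { names_.insert(name); }
    void remove(const std::string &name) { names_.erase(name); }
};
```

With such a set, water is non-standard until a caller adds it explicitly, for example with add("HOH").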

Bugs

Doesn't support PDB 2.0 atom charges.

writePDBfile() needs some work (aligning atom names correctly, removing the *'s added by readPDBfile(), output in PDB order, etc.).

The COMMENTS bit index doesn't do anything yet.

Adjacent residues whose linking atoms have alternate atom locations are only connected once.

Example

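int
main(int argc, char **argv)
{
  PDBio pdb_io;

  // Also process hydrogen bonds from CONECT records.
  pdb_io.whatToDo()[PDBio::HYDROGEN_BONDS] = true;
  // With no command-line argument, argv[1] is NULL and standard input is read.
  pdb_io.readPDBfile(argv[1]);
  if (!pdb_io.ok()) {
    cerr << pdb_io.error() << '\n';
    return 1;
  }
  process_molecule(pdb_io.molecule());
  return 0;
}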

Implementations

PDBio_default
The PDBio_default implementation uses a multimap indexed by Symbol to hold Atom pointers in a Residue, a map indexed by MolResId to hold Residue pointers in a Molecule, a back pointer from Atom to Residue, and a list of Molecule pointers in a PDBio.

Greg Couch, UCSF Computer Graphics Laboratory