Copyright © 1994-1996 by the Regents of the University of California.

PDB Class Reference

The PDB class provides methods for parsing Brookhaven Protein DataBank (PDB) records (lines from a PDB file) into structures and expanding those structures back into PDB records. Rather than provide access functions for each possible field, the structure containing the parsed record is publicly available. The field names are mostly the same as those given in the 2.0 version of the ``Atomic Coordinate Entry Format Description.'' The exceptions are:
  1. the HELIX class field is renamed type to avoid problems with C++;
  2. all residues are put into a Residue type; if the residue fields all start with a common word, then that word is the name of the residue structure, otherwise it is res;
  3. multiple fields of the same name are made into arrays;
  4. array subscripts start at zero;
  5. in DBREF records, seqBegin and seqEnd are not made into arrays; instead, the repeated instances have a 2 appended to their names;
  6. all field names that end with the letters id are written ID;
  7. instead of having separate structures for all three versions of ORIGXn, SCALEn, and MTRIXn, there is one for each record type;
  8. missing field names have been added; these may change in a future release.

    The PDB class has several enhancements to the Brookhaven Protein DataBank specification: four character residue names, the PDBRUN set of scene annotation records (see reference in the See Also section), and atom serial number overflow protection. Four character residue names work because wherever a three character residue name appears in the specification, it is followed by a blank. Atom serial number overflow protection works for ATOM, HETATM, and SIGATM records by assigning a unique serial number over 10000. Unfortunately, it is not possible to fix PDB records that refer to atom serial numbers, such as CONECT records.

    PDB Member Constants

    BufLen
    The maximum length of a generated PDB record string (including the null byte).
    PDBRUNVersion
    The default version of the PDBRUN scene annotation records.

    There are also constants for each known PDB record type (e.g., ATOM, HETATM, etc.), and the constant UNKNOWN for an unknown PDB record.

    The following constants are for each PDBRUN scene annotation record type: USER_PDBRUN, USER_EYEPOS, USER_ATPOS, USER_WINDOW, USER_FOCUS, USER_VIEWPORT, USER_BGCOLOR, USER_ANGLE, USER_DISTANCE, USER_FILE, USER_MARKNAME, USER_MARK, USER_CNAME, USER_COLOR, USER_RADIUS, USER_OBJECT, USER_ENDOBJ, USER_CHAIN, USER_GFX_BEGIN, USER_GFX_END, USER_GFX_COLOR, USER_GFX_NORMAL, USER_GFX_VERTEX, USER_GFX_FONT, USER_GFX_TEXTPOS, and USER_GFX_LABEL.

    The following constants are for the various graphics primitives supported in scenes: GFX_UNKNOWN, GFX_POINTS, GFX_MARKERS, GFX_LINES, GFX_LINE_STRIP, GFX_LINE_LOOP, GFX_TRIANGLES, GFX_TRIANGLE_STRIP, GFX_TRIANGLE_FAN, GFX_QUADS, GFX_QUAD_STRIP, and GFX_POLYGON.

    PDB Member Types

    typedef char Atom[5]
    A PDB atom name, e.g., NO2*.
    typedef char Date[10]
    A text field containing a date, typically day-month-year, where day is numeric, month is a three-letter abbreviation, and year is the last two digits of the year.
    typedef char IDcode[4]
    Generic short id field.
    typedef double Real
    Size of floating point numbers read and written.
    typedef char ResidueName[5]
    Residue name, e.g., ALA.
    struct Residue
    A Residue consists of a residue name (name), a chain identifier (chainID), a sequence number (seqNum), and an insertion code (iCode).

    PDB Member Functions

    PDB()
    PDB(RecordType t)
    PDB(const char *buf)
    Constructors. The first two above construct a zeroed instance of the given record type (default UNKNOWN). The last constructor above fills in all of the fields of the instance from the given PDB record text.
    RecordType type() const
    Return the type of PDB instance.
    void setType(RecordType t)
    Change the PDB record type of the instance and reinitialize all the fields to default values (zero in all cases except for an ATOM's occupancy which defaults to 1.0).
    const char *c_str() const
    Return a string containing the PDB record in textual form.
    static int PdbrunInputVersion()
    Return the current PDBRUN scene annotation version used to parse text records.
    static int PdbrunOutputVersion()
    Return the current PDBRUN scene annotation version used to create text records.
    static void setPdbrunInputVersion(int v)
    Set the current PDBRUN scene annotation version used to parse text records.
    static void setPdbrunOutputVersion(int v)
    Set the current PDBRUN scene annotation version used to create text records.
    static RecordType getType(const char *buf)
    Return the PDB record type for the given line of text.
    static GfxType getGfxType(const char *buf)
    Return the graphics type of the given text. Used to parse USER GFX BEGIN records.
    static const char *gfxChars(GfxType gt)
    Return a string representation of a graphics type.
    static int sscanf(const char *, const char *, ...)
    A version of sscanf(3) whose formats behave like FORTRAN formats, where field widths are sacrosanct. If the input line is short, then the rest of the fields are initialized to default values. Any literal characters in the format must match the input. The format characters are: space, ignore input character; c, character (array), default to a space; d, integer, default zero; f, double, default zero; s, C string with trailing spaces stripped and null terminated, default an empty string. sscanf returns the number of input fields converted (may be less than expected if the input line is short) or -1 if an error is found.
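
    The fixed-width behavior described above can be illustrated with a small stand-alone sketch. It does not use PDB::sscanf or its format strings; the helpers fixedField, fixedInt, and fixedString are hypothetical names introduced only for this illustration, and they assume zero-based column offsets supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the PDB class): return the text in the
// fixed-width field starting at zero-based column `start` and spanning
// `width` columns. Columns past the end of a short line read as empty.
std::string fixedField(const std::string &line, std::size_t start,
                       std::size_t width)
{
    assert(width > 0);
    if (start >= line.size())
        return std::string();
    return line.substr(start, width);   // substr stops at the end of the line
}

// Hypothetical helper: integer field that defaults to zero when the field
// is missing or blank, in the spirit of the d format character.
int fixedInt(const std::string &line, std::size_t start, std::size_t width)
{
    std::string field = fixedField(line, start, width);
    if (field.find_first_not_of(' ') == std::string::npos)
        return 0;
    return std::atoi(field.c_str());
}

// Hypothetical helper: string field with trailing spaces stripped, in the
// spirit of the s format character; a missing field gives an empty string.
std::string fixedString(const std::string &line, std::size_t start,
                        std::size_t width)
{
    std::string field = fixedField(line, start, width);
    std::size_t last = field.find_last_not_of(' ');
    if (last == std::string::npos)
        return std::string();
    return field.substr(0, last + 1);
}
```

    With the truncated line "ATOM     12  CA  ALA", fixedInt(line, 6, 5) yields 12 and fixedString(line, 17, 3) yields "ALA", while fields that lie beyond the end of the line fall back to zero or an empty string instead of reading neighboring columns.
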
    static int sprintf(char *, const char *, ...)
    A version of sprintf(3) whose formats behave like FORTRAN formats, where field widths are sacrosanct. Literal characters are copied as is. If the text or number to be printed is larger than the given field width, then the field is filled in with asterisks. The format characters are: d, integer; D, integer where zero is written as spaces; s, right-justified string (a negative field width left-justifies); c, character (array), zero characters are converted to spaces; f, floating point, normal printf precisions are used.
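
    The overflow rule can be sketched independently of the class. The helpers below (fitField, fitInt, fitIntBlankZero) are hypothetical and are not the PDB::sprintf implementation; they only show a value being right-justified in a fixed number of columns, replaced by asterisks when it does not fit, and a zero being written as spaces as the D format character does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the PDB class): right-justify `text` in
// exactly `width` columns, or fill the field with asterisks if it is too long.
std::string fitField(const std::string &text, std::size_t width)
{
    assert(width > 0);
    if (text.size() > width)
        return std::string(width, '*');
    return std::string(width - text.size(), ' ') + text;
}

// Hypothetical helper: integer in a fixed-width field (like the d character).
std::string fitInt(int value, std::size_t width)
{
    return fitField(std::to_string(value), width);
}

// Hypothetical helper: as fitInt, but zero is written as spaces (like D).
std::string fitIntBlankZero(int value, std::size_t width)
{
    if (value == 0)
        return std::string(width, ' ');
    return fitInt(value, width);
}
```

    Because every field always occupies exactly its declared width, later columns in the record never shift, even when a value overflows.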

    PDB I/O Functions

    ostream &operator<<(ostream &s, const PDB &p)
    Output the current PDB record on the given output stream.
    istream &operator>>(istream &s, PDB &p)
    Read a line from the given input stream and convert to a PDB record.

    See Also

    PDB version 2.0.
    ``Annotating PDB Files with Scene Information,'' Gregory S. Couch, et al., Journal of Molecular Graphics, 13, 3, June 1995.

    Notes

    The subtype field of the USERxx structure tells what the xx part was. The rest of the line, up to the card sequence portion, is the text field.

    Due to the way Brookhaven encodes their files, atom names often have leading blanks and sometimes have embedded blanks. Residue names occasionally have leading blanks too. To be entirely consistent with the PDB format, the programmer should put those blanks in before using the c_str member function.
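
    A minimal sketch of restoring such blanks in an atom name follows. It assumes the common PDB convention that the element symbol is right-justified in the first two columns of the four-column atom name field, so the name of a one-letter element gets one leading blank unless it already fills all four columns. The helper padAtomName and its elementLen parameter are hypothetical and are not part of the PDB class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the PDB class): lay out an atom name in
// the four-column atom name field. `elementLen` is the length of the
// element symbol (1 or 2). Assumption: one-letter elements start in the
// second column unless the name needs all four columns.
std::string padAtomName(const std::string &name, std::size_t elementLen)
{
    assert(elementLen == 1 || elementLen == 2);
    assert(!name.empty() && name.size() <= 4);
    std::string field = name;
    if (elementLen == 1 && field.size() < 4)
        field.insert(0, 1, ' ');            // leading blank
    field.append(4 - field.size(), ' ');    // trailing blanks up to 4 columns
    return field;
}
```

    Under this assumption an alpha carbon becomes " CA " while a calcium atom becomes "CA  ", which is why the blanks cannot be recovered from the stripped name alone.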

    Bugs

    Neither JRNL nor REMARK records are fully parsed, even though there are additional PDB conventions for them.

    Routines are needed to convert to and from PDB typesetting conventions in COMPND, SOURCE, AUTHOR, and JRNL records.

    Example

    #include <PDB.h>
    
    ....
    PDB	record;
    
    while (cin >> record) {
    	switch (record.type()) {
    		case PDB::ATOM:
    			// print the x, y, and z coordinates of the atom
    			cout << record.atom.xyz[0] << ' ' << record.atom.xyz[1]
    				<< ' ' << record.atom.xyz[2] << endl;
    			....
    			break;
    	}
    }
    ....