io: Manage file formats that can be opened and exported
The io module keeps track of the functions that can open, fetch, and export data in various formats.
I/O sources and destinations are specified as filenames, and the appropriate
open or export function is found by deducing the format from the suffix of the
filename.
An additional compression suffix, i.e., .gz, indicates that the file is or
should be compressed.
In addition to reading data from files,
data can be fetched from the Internet.
In that case, instead of a filename,
the data source is specified as prefix:identifier, e.g., pdb:1gcn, where
the prefix identifies the data format, and the identifier selects the data.
All data I/O is in binary.
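The suffix and prefix rules above can be illustrated with a minimal sketch. It is not the io module's implementation: the lookup tables, the function name split_filespec, and the layout of the returned tuple are assumptions made for illustration only.

```python
import os

# Hypothetical lookup tables; a real registry would be filled by register_format calls.
EXTENSIONS = {".pdb": "PDB", ".cif": "mmCIF"}
PREFIXES = {"pdb": "PDB", "mmcif": "mmCIF", "cif": "mmCIF"}
COMPRESSION_SUFFIXES = (".gz",)


def split_filespec(filespec):
    """Return (format_name, is_prefix, name, compression) for a filespec.

    format_name is None when nothing matches.
    """
    # prefix:identifier form, e.g. "pdb:1gcn"
    prefix, sep, identifier = filespec.partition(":")
    if sep and prefix.lower() in PREFIXES:
        return PREFIXES[prefix.lower()], True, identifier, None

    # Otherwise treat it as a filename; peel off a compression suffix first.
    name = filespec
    compression = None
    for suffix in COMPRESSION_SUFFIXES:
        if name.lower().endswith(suffix):
            compression = suffix
            name = name[: -len(suffix)]
            break
    ext = os.path.splitext(name)[1].lower()
    return EXTENSIONS.get(ext), False, filespec, compression
```

Under these assumptions, split_filespec("pdb:1gcn") reports a prefix reference that still has to be fetched, while split_filespec("model.cif.gz") reports a gzip-compressed mmCIF file.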
-
register_format(format_name, category, extensions, prefixes=(), mime=(), reference=None, dangerous=None, icon=None, **kw)
Register a file format’s I/O functions and metadata.
Parameters:
- format_name – the format’s name.
- category – the kind of data the format should be classified as.
- extensions – a sequence of filename suffixes, each starting with a period. If the format doesn’t open from a filename (e.g., a PDB ID code), then extensions should be an empty sequence.
- prefixes – a sequence of filename prefixes (without the ‘:’), possibly empty.
- mime – a sequence of MIME types, possibly empty.
- reference – a URL linking to the format’s specification.
- dangerous – should be True for formats that can write or delete a user’s files. False by default, except for the SCRIPT category.
Todo
Possibly break this up into multiple functions.
-
register_open(format_name, open_function, requires_filename=False)
Register a function that reads data from a stream.
Parameters:
- open_function – a function that takes an I/O stream or filename and returns a 2-tuple of a list of models and a status message.
- requires_filename – True if the first argument must be a filename.
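As a rough illustration of the shape an open function is expected to have, the sketch below reads a made-up text format from a binary stream and returns the 2-tuple described above. The function name open_points and the "one x y z triple per line" format are hypothetical, and a plain list of tuples stands in for real model objects.

```python
from io import BytesIO  # Python's standard library io, not the module documented here


def open_points(stream):
    """Read whitespace-separated x y z triples, one per line, from a binary stream.

    Returns (models, status). Here a "model" is just a list of coordinate
    tuples; a real open function would return the application's model objects.
    """
    points = []
    for raw_line in stream:
        line = raw_line.decode("utf-8").strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        x, y, z = (float(field) for field in line.split())
        points.append((x, y, z))
    models = [points]
    status = "Read %d points" % len(points)
    return models, status


# Usage with an in-memory binary stream standing in for an opened file.
models, status = open_points(BytesIO(b"# demo\n0 0 0\n1.5 2 3\n"))
```

Because the stream is binary, the sketch decodes each line itself before parsing it.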
-
register_fetch(format_name, fetch_function)
Register a function that fetches data from the Internet.
Parameters:
- fetch_function – a function that takes an identifier and returns an I/O stream for reading the data and an identifying name. Usually the name is the same as the identifier.
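A fetch function along these lines might look like the following sketch. The URL template, the function name fetch_entry, and the extra opener keyword are all assumptions; example.org is a placeholder rather than a real data service, and the opener keyword exists only so the sketch can be tried without network access.

```python
from io import BytesIO
from urllib.request import urlopen

# Hypothetical URL template; example.org is a placeholder, not a real data service.
URL_TEMPLATE = "https://example.org/entries/%s.dat"


def fetch_entry(identifier, opener=urlopen):
    """Return (stream, name) for the given identifier.

    opener is an extra keyword, added here only so the sketch can be
    exercised offline; a registered fetch function would take just the identifier.
    """
    url = URL_TEMPLATE % identifier.lower()
    stream = opener(url)  # binary, file-like object
    return stream, identifier


# Offline usage: a stand-in opener that returns canned bytes.
def fake_opener(url):
    return BytesIO(("data for " + url).encode("utf-8"))


stream, name = fetch_entry("1GCN", opener=fake_opener)
```

The returned name is simply the identifier, matching the usual case described above.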
-
formats()
Return all known format names.
-
open_data(session, filespec, as_a=None, label=None, **kw)
Open a (compressed) file.
Parameters:
- filespec – ‘prefix:id’ or a (compressed) filename.
- as_a – treat the file as if it has the given format.
- label – optional name used to identify the data source.
If a file format requires a filename, then compressed files are uncompressed into a temporary file before calling the open function.
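The temporary-file step can be sketched as follows. This is an assumption about how such a step could be written with the standard library, not the module's actual code; the helper name open_with_filename is hypothetical and only the .gz suffix is handled.

```python
import gzip
import os
import shutil
import tempfile


def open_with_filename(path, open_function):
    """Call open_function(filename), gunzipping to a temporary file first if needed."""
    if not path.endswith(".gz"):
        return open_function(path)

    # Keep the inner extension (e.g. ".cif" from "model.cif.gz") on the temp file.
    inner_ext = os.path.splitext(path[: -len(".gz")])[1]
    fd, tmp_path = tempfile.mkstemp(suffix=inner_ext)
    try:
        # Wrap the descriptor first so it is closed even if gzip.open fails.
        with os.fdopen(fd, "wb") as dst, gzip.open(path, "rb") as src:
            shutil.copyfileobj(src, dst)
        return open_function(tmp_path)
    finally:
        os.remove(tmp_path)
```

The temporary file is removed once the open function returns, so in this sketch the open function must finish reading before it returns.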
-
open_multiple_data(session, filespecs, **kw)
Open one or more files, including handling formats where multiple files contribute to a single model, such as image stacks.
-
prefixes(format_name)
Return filename prefixes for the named format.
prefixes(format_name) -> [filename-prefix(es)]
-
extensions(format_name)
Return filename extensions for the named format.
extensions(format_name) -> [filename-extension(s)]
-
open_function(format_name)
Return the open callback for the named format.
open_function(format_name) -> function
-
fetch_function(format_name)
Return the fetch callback for the named format.
fetch_function(format_name) -> function
-
export_function(format_name)
Return the export callback for the named format.
export_function(format_name) -> function
-
mime_types(format_name)
Return the MIME types for the named format.
-
requires_filename(format_name)
Return whether the named format needs a seekable file.
-
dangerous(format_name)
Return whether the named format can write to files.
-
category(format_name)
Return the category of the named format.
-
format_names(open=True, export=False, source_is_file=False)
Return a list of known format names.
format_names() -> [format-name(s)]
By default, only the names of openable formats are returned.
-
categorized_formats(open=True, export=False)
Return known formats by category.
categorized_formats() -> { category: formats() }
-
deduce_format(filename, has_format=None, prefixable=True)
Figure out the named format associated with a filename.
Return a tuple of the deduced format name, whether it was a prefix reference, the unmangled filename, and the compression format (if present). If it is a prefix reference, then the data needs to be fetched.
Example:
    io.register_format("mmCIF", "Molecular structure",
                       (".cif",), ("mmcif", "cif"),
                       mime=("chemical/x-cif", "chemical/x-mmcif"),
                       reference="http://mmcif.wwpdb.org/",
                       requires_seeking=True, open_func=open_mmCIF)