Boltz
is an artificial-intelligence method for predicting biomolecular structures
containing proteins, RNA, DNA, and/or other molecules.
Inspired by AlphaFold3, Boltz is fully open-source and freely available
for both academic and commercial use under the MIT license. See:
Boltz-1 Democratizing Biomolecular Interaction Modeling.
Wohlwend J, Corso G, Passaro S, Reveiz M, Leidal K, Swiderski W, Portnoi T, Chinn I, Silterra J, Jaakkola T, Barzilay R.
bioRxiv [Preprint]. 2024 Dec 27:2024.11.19.624167.
The boltz command is also implemented as the
Boltz tool.
Boltz-predicted structures vary in confidence levels (see
coloring)
and should be interpreted with caution. Boltz residue-residue
predicted aligned error (PAE) values can be plotted with
alphafold pae.
See the ChimeraX Boltz details and the video
"Boltz structure prediction in ChimeraX."
See also:
alphafold,
esmfold,
modeller,
computational screening for protein-protein interactions
Installing Boltz
Running a Boltz Prediction
Limitations
Installing Boltz
Usage:
boltz install [ directory ]
[ downloadModelWeightsAndCcd true | false ]
The boltz install command creates a Python virtual environment
and installs Boltz into it from PyPI.
If no directory is specified, then ~/boltz2 in the user's home directory
is used. The directory is created if it does not exist; if it already exists, it must be empty.
The Boltz network parameters and Chemical Component Dictionary are downloaded
to ~/.boltz. An index is created of the atom counts for each CCD code
so that the ChimeraX Boltz interface
can report the total number of tokens (residues plus ligand atoms)
in an assembly for judging whether the computer has enough memory
to make the requested prediction.
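As a rough illustration of the token count described above (residues plus ligand atoms), here is a minimal Python sketch. The function name, the dictionary layout, and the atom counts are hypothetical placeholders, not the actual ChimeraX index format, and whether hydrogens are counted is not specified here.

```python
# Hypothetical atom-count index: placeholder keys standing in for CCD codes.
# The counts are illustrative only and are not taken from the CCD.
EXAMPLE_ATOM_COUNTS = {"LIG": 20, "ION": 1}


def count_tokens(polymer_sequences, ligand_codes, atom_counts):
    """Return the number of tokens: one per residue plus one per ligand atom."""
    residue_tokens = sum(len(seq) for seq in polymer_sequences)
    ligand_tokens = sum(atom_counts[code] for code in ligand_codes)
    return residue_tokens + ligand_tokens


# A 10-residue protein, a 4-residue RNA, one 20-atom ligand, and two ions.
total = count_tokens(["MKTAYIAKQR", "ACGU"], ["LIG", "ION", "ION"],
                     EXAMPLE_ATOM_COUNTS)
print(total)  # 10 + 4 + 20 + 1 + 1 = 36
```

Each copy of a ligand contributes its atoms separately, which is why the two ions above add two tokens.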
The ChimeraX Python executable is used to create the virtual environment.
If the ChimeraX installation is moved or deleted, Boltz will need to be
reinstalled. It will also stop working if the boltz directory itself is moved,
since the executable refers to the install location to find python.
(Otherwise, the installation need only be done once per computer.)
The following commands are used to make the virtual environment and install
Boltz. On Windows, if Nvidia graphics is detected, a version of torch
with CUDA 12.6 support is installed before boltz.
python -m venv directory
directory/bin/python -m pip install torch --index-url https://download.pytorch.org/whl/cu126 # On Windows with Nvidia GPU only.
directory/bin/python -m pip install boltz
directory/bin/python chimerax/site-packages/boltz/download_weights_and_ccd.py
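The steps listed above can be sketched in Python as a function that only builds the argument lists and runs nothing. The function name and the windows_nvidia flag are assumptions made for illustration. The paths mirror the listing as written; Windows virtual environments typically place the interpreter under Scripts\python.exe rather than bin/python, which this sketch does not handle.

```python
from pathlib import PurePosixPath

CUDA_INDEX_URL = "https://download.pytorch.org/whl/cu126"


def boltz_install_commands(directory, windows_nvidia=False):
    """Return the install steps as argument lists, in order (nothing is run)."""
    # "~" is kept literally here; a real script would expand it first.
    venv_python = str(PurePosixPath(directory) / "bin" / "python")
    commands = [["python", "-m", "venv", directory]]
    if windows_nvidia:
        # CUDA-enabled torch is installed before boltz (Windows + Nvidia only).
        commands.append([venv_python, "-m", "pip", "install", "torch",
                         "--index-url", CUDA_INDEX_URL])
    commands.append([venv_python, "-m", "pip", "install", "boltz"])
    # Script path copied from the listing above.
    commands.append([venv_python,
                     "chimerax/site-packages/boltz/download_weights_and_ccd.py"])
    return commands


for command in boltz_install_commands("~/boltz2"):
    print(" ".join(command))
```

Returning argument lists rather than shell strings makes it straightforward to pass each step to subprocess.run without shell quoting problems.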
Running a Boltz Prediction
Usage:
boltz predict [ sequences ]
[ protein sequences ]
[ dna sequences ]
[ rna sequences ]
[ ligands residue-spec [ excludeLigands CCD-names ]]
[ ligandCcd CCD-names ]
[ ligandSmiles SMILES-string ]
[ affinity ligand-name ]
[ name prediction-name ]
[ resultsDirectory directory ]
[ device default | cpu | gpu ]
[ kernels true | false ]
[ precision 16 | 32 | bf16-true | bf16-mixed ]
[ steering true | false ]
[ samples N ]
[ recycles M ]
[ seed K ]
[ useMsaCache true | false ]
[ open true | false ]
[ installLocation directory ]
[ wait true | false ]
Biopolymer chains.
The sequences of biopolymer chains to predict can be given as a
comma-separated list of any of the following:
- a chain-spec
for one or more chains in atomic structure(s) open in ChimeraX
- the sequence-spec of a sequence
in the Sequence Viewer,
in the form:
alignment-ID:sequence-ID
- a UniProt
name or accession number for a protein chain
- a plain-text string of 1-letter residue codes pasted directly into
the command line
If given with the protein, dna, or rna keywords,
the sequences argument can be of the same form as described above,
but chains other than protein, DNA, or RNA (respectively) will be excluded.
The dna option will interpret single-letter codes as DNA,
the rna option will interpret single-letter codes as RNA,
and neither will accept
UniProt identifiers
since they are only for protein chains.
The protein, dna, and rna options
can be used more than once in the same command.
Ligand, cofactor, and ion components.
Residues present in currently open structures can be specified with
ligands residue-spec,
optionally with excludeLigands to omit specific types from that set.
For example, if ligands #1 was given but not all of
the small molecules in #1 are wanted, excludeLigands
would be used to list which residue types to omit.
Residues to exclude are specified by a comma-separated list of their
3- or 5-letter residue names in the PDB Chemical Component Dictionary (CCD).
By default, CCD name HOH (water) is excluded.
Ligands to include can also be specified by a comma-separated list of
CCD names with the ligandCcd option, or by a comma-separated list of
SMILES strings with the ligandSmiles option.
The ligandCcd and ligandSmiles options can be used more than
once in the same command.
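To show how the biopolymer and ligand options combine, the following Python sketch assembles a boltz predict command string from the keywords in the usage above. The helper name and its parameters are hypothetical, and it does no quoting or validation, so it is only suitable for simple specifiers without spaces.

```python
def boltz_predict_command(protein=(), dna=(), rna=(),
                          ligand_ccd=(), ligand_smiles=(), name=None):
    """Build a 'boltz predict' command string (illustrative helper only)."""
    parts = ["boltz", "predict"]
    options = (("protein", protein), ("dna", dna), ("rna", rna),
               ("ligandCcd", ligand_ccd), ("ligandSmiles", ligand_smiles))
    for keyword, values in options:
        if values:
            # Each option takes a comma-separated list.
            parts += [keyword, ",".join(values)]
    if name is not None:
        parts += ["name", name]
    return " ".join(parts)


# Chain A of model #1 with ATP and a magnesium ion, given by CCD name.
print(boltz_predict_command(protein=["#1/A"], ligand_ccd=["ATP", "MG"],
                            name="test1"))
# prints: boltz predict protein #1/A ligandCcd ATP,MG name test1
```

The resulting string could be typed into the ChimeraX command line, or passed to run() from chimerax.core.commands in a ChimeraX Python script.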
Calculation options:
- affinity ligand-name
– the ligand for which to predict binding affinity (if not specified,
no affinity will be predicted).
Boltz can predict the binding affinity in µM for a single ligand.
It was trained using Kd, Ki, and IC50 affinity values, treating them as
equivalent, so the predicted affinity should be interpreted as a
qualitative affinity without a precise definition.
Only one affinity prediction is made even if the system contains
multiple ligands, and the affinity cannot be predicted for
ligands that occur in more than one copy.
The ligand-name is the same CCD code or SMILES string used to
specify it as part of the complex to be predicted.
If a residue-spec in an
existing model was used, the residue name is assumed to be a CCD code.
- name prediction-name
– a name for the prediction
to be used in naming the output folder and files
- resultsDirectory directory
– the pathname (name and location) of a folder or directory
in which to store prediction results. It may include “[name]”
to indicate substitution by the specified prediction-name.
If the folder already exists, a numeric suffix _1, _2, _3... will be appended.
- device default | cpu | gpu
– whether to run the computation on the GPU or CPU.
The default setting chooses based on the availability of an Nvidia
or Mac M series GPU and Torch support for the GPU.
- kernels true | false
– whether to use Nvidia CUDA optimization kernels.
In a test on 100 protein monomers of 100-700 residues, using the kernels
was 30% faster with no loss of accuracy.
The default is true on Linux with a GPU, otherwise false.
- precision 16 | 32 | bf16-true | bf16-mixed
– precision of floating-point operations in PyTorch Lightning,
the machine-learning toolkit used by Boltz:
- 16 – 16-bit IEEE floating point
- 32 – 32-bit IEEE floating point
- bf16-true – always use bfloat16 (bf16),
a 16-bit floating-point format that keeps the dynamic range of
32-bit floating point (maximum value about 3e38) at the cost of precision,
retaining only about 3 significant decimal digits
- bf16-mixed (default on Linux+Nvidia)
– use bfloat16 for some parts of the calculation and
32-bit floating point for others that typically require higher precision
- steering true | false
– whether to use Boltz diffusion steering potentials,
which are claimed to increase accuracy, but have been observed to
increase run times (1.25x on Nvidia, 2.5x on Mac).
- samples N
– number of predictions (default 1). This is what Boltz calls
"diffusion samples." Creating additional structures takes much less time
than creating the first structure.
- recycles M
– number of passes through the neural net that derives spatial
information from which structures will be computed (default 3).
Higher numbers (e.g. 10) may give better structures but at the cost
of increased runtime.
- seed K
– random number seed (an integer) to initialize the calculation.
Runs with different seeds will give different results.
- useMsaCache true | false
– whether to cache (and potentially reuse) the deep sequence alignments
generated by the Colabfold server for protein chains (default true).
The alignment cache location is ~/Downloads/ChimeraX/BoltzMSA/.
Reusing an alignment saves time when multiple predictions will be performed
for the same protein or set of proteins but different small-molecule ligands.
Because the alignments for different proteins in an assembly are paired
to match ones from the same organisms, the cached alignments can only be reused
for assemblies with the exact same set of proteins.
Alignments computed for individual proteins from multiple different runs
cannot be used for an assembly of those proteins.
- open true | false
– whether to open the predicted structures when the prediction finishes
(default true). Initial coloring is by residue confidence values.
The structures will be aligned to an already open model with
matchmaker
if that open model was the first used in specifying the assembly.
- installLocation directory
– where Boltz is installed.
If specified, this sets the default location for future ChimeraX sessions.
- wait true | false
– whether ChimeraX should freeze until the calculation finishes (true)
or remain usable during the computation (false, the default).
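To illustrate the trade-off behind the bf16 settings of the precision option, the following Python sketch emulates bfloat16 by keeping only the upper 16 bits of a 32-bit float. This is a simplification: it truncates, whereas real conversions typically round to nearest, and it is not how PyTorch or Boltz implement the format. The function names are hypothetical.

```python
import math
import struct


def to_bfloat16(value):
    """Round-trip a number through bfloat16 by truncation (sketch only)."""
    bits32 = struct.unpack(">I", struct.pack(">f", value))[0]
    # bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits.
    return struct.unpack(">f", struct.pack(">I", bits32 & 0xFFFF0000))[0]


def fits_float16(value):
    """Return True if value is within the range of IEEE 16-bit floating point."""
    try:
        struct.pack(">e", value)
        return True
    except OverflowError:
        return False


print(to_bfloat16(math.pi))   # 3.140625, about 3 significant digits survive
print(to_bfloat16(1e37))      # still finite and within 1% of 1e37
print(fits_float16(1e37))     # False: IEEE 16-bit overflows above 65504
```

The example shows why bfloat16 is less prone to overflow than the IEEE 16-bit format (precision 16) while carrying fewer significant digits.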
UCSF Resource for Biocomputing, Visualization, and Informatics /
September 2025