Boltz is an artificial-intelligence method for predicting biomolecular structures containing proteins, RNA, DNA, and/or other molecules. Inspired by AlphaFold3, Boltz is fully open-source and freely available for both academic and commercial use under the MIT license. See:
Boltz-1 Democratizing Biomolecular Interaction Modeling. Wohlwend J, Corso G, Passaro S, Reveiz M, Leidal K, Swiderski W, Portnoi T, Chinn I, Silterra J, Jaakkola T, Barzilay R. bioRxiv [Preprint]. 2024 Dec 27:2024.11.19.624167.
Boltz-predicted structures vary in confidence levels (see coloring) and should be interpreted with caution. Boltz residue-residue predicted aligned error (PAE) values can be plotted with alphafold pae. The boltz command is also implemented as the Boltz tool. See the ChimeraX Boltz example and the video "Boltz structure prediction in ChimeraX."
See also: alphafold, esmfold, modeller
Installing Boltz
Running a Boltz Prediction
Limitations
Usage: boltz install [ directory ] [ downloadModelWeightsAndCcd true | false ]
The boltz install command creates a Python virtual environment and installs Boltz into it from PyPI. If no directory is specified, ~/boltz in the user's home directory is used. The directory is created if it does not exist; if it already exists, it must be empty. The Boltz network parameters and Chemical Component Dictionary (CCD) are downloaded to ~/.boltz. An index of the atom counts for each CCD code is also created, so that the ChimeraX Boltz interface can report the total number of tokens (residues plus ligand atoms) in an assembly. The token count helps in judging whether the computer has enough memory to make the requested prediction.
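The token count described above (residues plus ligand atoms) can be sketched in a few lines of Python. This is only an illustration: the function name count_tokens, the dictionary layout of the atom-count index, and the placeholder codes LIG and ION are assumptions, not the format ChimeraX actually stores, and the atom counts shown are made up rather than taken from the CCD.

```python
def count_tokens(polymer_sequences, ligand_ccd_codes, ccd_atom_counts):
    """Estimate tokens as polymer residues plus ligand atoms.

    polymer_sequences: one-letter sequence strings, one per chain.
    ligand_ccd_codes: CCD codes, one entry per ligand copy.
    ccd_atom_counts: dict mapping CCD code -> atom count
                     (hypothetical stand-in for the real index).
    """
    residue_tokens = sum(len(seq) for seq in polymer_sequences)
    ligand_tokens = sum(ccd_atom_counts[code] for code in ligand_ccd_codes)
    return residue_tokens + ligand_tokens


# Illustrative values only: placeholder codes with made-up atom counts.
example_index = {"LIG": 20, "ION": 1}
total = count_tokens(["MKTAYIAKQR", "ACGT"], ["LIG", "ION", "ION"], example_index)
print(total)  # 10 + 4 residues, 20 + 1 + 1 ligand atoms
```

A larger total generally means a more memory-hungry prediction, which is why the interface reports it before a job is started.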
The ChimeraX Python executable is used to create the virtual environment. If the ChimeraX installation is moved or deleted, Boltz will need to be reinstalled. It will also stop working if the boltz directory itself is moved, since the executable refers to the install location to find Python. (Otherwise, the installation need only be done once per computer.)
The following commands are used to make the virtual environment and install Boltz. On Windows, if Nvidia graphics is detected, a version of torch with CUDA 12.6 support is installed before boltz.
python -m venv directory
directory/bin/python -m pip install torch --index-url https://download.pytorch.org/whl/cu126   # On Windows with Nvidia GPU only.
directory/bin/python -m pip install boltz
directory/bin/python chimerax/site-packages/boltz/download_weights_and_ccd.py
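The sequence of install commands, together with the rule that an existing directory must be empty, can be sketched as follows. The helper boltz_install_commands is hypothetical; it only builds the argument lists and does not run anything. The paths are copied literally from the commands listed above, so the download script path stays relative as given.

```python
from pathlib import Path


def boltz_install_commands(directory="~/boltz", windows_nvidia=False):
    """Return the install commands as argument lists, without running them."""
    directory = Path(directory).expanduser()
    # An existing install directory is only acceptable if it is empty.
    if directory.exists() and any(directory.iterdir()):
        raise ValueError(f"{directory} already exists and is not empty")

    venv_python = str(directory / "bin" / "python")
    commands = [["python", "-m", "venv", str(directory)]]
    if windows_nvidia:
        # CUDA-enabled torch is installed before boltz in this case only.
        commands.append([venv_python, "-m", "pip", "install", "torch",
                         "--index-url", "https://download.pytorch.org/whl/cu126"])
    commands.append([venv_python, "-m", "pip", "install", "boltz"])
    commands.append([venv_python,
                     "chimerax/site-packages/boltz/download_weights_and_ccd.py"])
    return commands


for argv in boltz_install_commands("no_such_boltz_dir_example"):
    print(" ".join(argv))
```

In practice boltz install performs these steps itself; the sketch is only meant to make the order and the Windows-specific branch explicit.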
Usage: boltz predict [ sequences ] [ protein sequences ] [ dna sequences ] [ rna sequences ] [ ligands residue-spec [ excludeLigands CCD-names ]] [ ligandCcd CCD-names ] [ ligandSmiles SMILES-string ] [ name prediction-name ] [ resultsDirectory directory ] [ device default | cpu | gpu ] [ float16 true | false ] [ samples N ] [ recycles M ] [ seed K ] [ useMsaCache true | false ] [ open true | false ] [ installLocation directory ] [ wait true | false ]
Biopolymer chains.
The sequences of biopolymer chains to predict can be given as a
comma-separated list of any of the following:
If given with the protein, dna, or rna keywords, the sequences argument can be of the same form as described above, but chains other than protein, DNA, or RNA (respectively) will be excluded. The dna option will interpret single-letter codes as DNA, the rna option will interpret single-letter codes as RNA, and neither will accept UniProt identifiers since they are only for protein chains. The protein, dna, and rna options can be used more than once in the same command.
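As a rough illustration of how the protein, dna, and rna keywords combine, the following Python sketch assembles a boltz predict command string following the usage line. The helper boltz_predict_command is hypothetical, and the example assumes that raw single-letter sequences are an accepted form of the sequences argument, as the mention of single-letter codes above suggests. The sequences themselves are arbitrary.

```python
def boltz_predict_command(protein=(), dna=(), rna=(), name=None):
    """Build a 'boltz predict' command string.

    Each of protein, dna, and rna is a list of groups; every group is a
    list of sequences that becomes one comma-separated keyword argument,
    so a keyword may appear more than once in the command.
    """
    parts = ["boltz", "predict"]
    for keyword, groups in (("protein", protein), ("dna", dna), ("rna", rna)):
        for sequences in groups:
            parts += [keyword, ",".join(sequences)]
    if name is not None:
        parts += ["name", name]
    return " ".join(parts)


cmd = boltz_predict_command(
    protein=[["MKTAYIAKQR"]],
    dna=[["ACGT", "ACGT"]],
    name="demo",
)
print(cmd)  # boltz predict protein MKTAYIAKQR dna ACGT,ACGT name demo
```

Using the dna keyword here is what makes ACGT a DNA chain; the same letters are also valid one-letter amino acid codes.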
Ligand, cofactor, and ion components.
Residues present in currently open structures can be specified with
ligands residue-spec,
optionally with excludeLigands to omit specific types from that set.
For example, if ligands #1 was given but not all of
the small molecules in #1 are wanted, excludeLigands
would be used to list which residue types to omit.
Residues to exclude are specified by a comma-separated list of their
3- or 5-letter residue names in the PDB Chemical Component Dictionary (CCD).
By default, CCD name HOH (water) is excluded.
Ligands to include can also be specified by a comma-separated list of
CCD names with the ligandCcd option, or by a comma-separated list of
SMILES strings with the ligandSmiles option.
The ligandCcd and ligandSmiles options can be used more than
once in the same command.
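The effect of excludeLigands on a set of residues can be sketched as a simple filter over CCD residue names. The function names below are hypothetical, and the sketch assumes that an explicit exclusion list replaces the default (HOH) rather than adding to it, which the text above does not state.

```python
DEFAULT_EXCLUDED = ("HOH",)  # water is excluded by default


def parse_ccd_names(text):
    """Split a comma-separated list of CCD names, e.g. 'HOH,SO4'."""
    return [name.strip().upper() for name in text.split(",") if name.strip()]


def select_ligands(residue_names, exclude_ligands=DEFAULT_EXCLUDED):
    """Keep residue names that are not in the exclusion list.

    Assumption: passing exclude_ligands replaces the default list, so
    HOH must be listed explicitly to keep excluding water.
    """
    excluded = {name.upper() for name in exclude_ligands}
    return [name for name in residue_names if name.upper() not in excluded]


residues = ["HEM", "HOH", "SO4", "HOH", "GOL"]
print(select_ligands(residues))
print(select_ligands(residues, parse_ccd_names("HOH,SO4,GOL")))
```

With the default list only water is dropped; with the second call, sulfate and glycerol are dropped as well, leaving the heme.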
Calculation options: