openprotein.fold#

Create PDBs of your protein sequences via our folding models!

Note that for Boltz and AlphaFold2 models, you will also need to use our align workflow to create MSAs.

Interface#

class openprotein.fold.FoldAPI(session)[source]#

Fold API provides a high-level interface for making protein structure predictions.

boltz2: Boltz2Model#: Boltz-2 model

boltz_2: Boltz2Model#

boltz1x: Boltz1xModel#: Boltz-1x model

boltz_1x: Boltz1xModel#

boltz1: Boltz1Model#: Boltz-1 model

boltz_1: Boltz1Model#

af2: AlphaFold2Model#: AlphaFold-2 model

alphafold2: AlphaFold2Model#

rf3: RosettaFold3Model#: RosettaFold-3 model

rosettafold_3: RosettaFold3Model#

esmfold: ESMFoldModel#: ESMFold model

minifold: MiniFoldModel#: MiniFold model
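The attribute list above pairs a short name with a longer one for most models (for example boltz2 and boltz_2). A minimal sketch of resolving either spelling to one attribute follows. It assumes each pair refers to the same model, which the shared types suggest but the text does not state outright; `canonical_attr` and `pick_model` are hypothetical helpers, not part of the library.

```python
# Alias pairs copied from the FoldAPI attribute list above.
# Assumption: both names in a pair point at the same model object.
ALIASES = {
    "boltz_2": "boltz2",
    "boltz_1x": "boltz1x",
    "boltz_1": "boltz1",
    "alphafold2": "af2",
    "rosettafold_3": "rf3",
}
# Attributes documented with a single name only.
STANDALONE = {"esmfold", "minifold"}


def canonical_attr(name: str) -> str:
    """Map any documented attribute name to its short form."""
    key = name.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    if key in ALIASES.values() or key in STANDALONE:
        return key
    raise KeyError(f"unknown fold model attribute: {name!r}")


def pick_model(fold_api, name: str):
    """Fetch a model from a FoldAPI-like object by attribute name."""
    return getattr(fold_api, canonical_attr(name))
```

With a real FoldAPI instance, `pick_model(fold_api, "alphafold2")` would be expected to return the same object as `fold_api.af2`.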

list_models()[source]#

List the models available for folding your sequences.

get_model(model_id)[source]#

Get model by model_id.

FoldModel supports the usual job manipulation, e.g. making POST and GET requests for this model specifically.

Parameters:: model_id (str) – the model identifier
Returns:: The model
Return type:: FoldModel
Raises:: HTTPError – If the GET request does not succeed.

get_results(job)[source]#

Retrieves the results of a fold job.

Parameters:: job (Job) – The fold job whose results are to be retrieved.
Returns:: An instance of FoldResultFuture
Return type:: FoldResultFuture

Models#

class openprotein.fold.Boltz2Model(session, model_id, metadata=None)[source]#

Class providing inference endpoints for the Boltz-2 structure prediction model, which jointly models complex structures and binding affinities.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, num_recycles=3, num_steps=200, step_scale=1.638, use_potentials=False, constraints=None, templates=None, properties=None, method=None)[source]#

Request structure prediction with Boltz-2 model.

Parameters:

proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Each Protein object must be tagged with an msa; use Protein.single_sequence_mode as the msa to run in single-sequence mode. Alternatively, supply an MSAFuture to use all of its query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in folded output.
diffusion_samples (int) – Number of diffusion samples to use
num_recycles (int) – Number of recycling steps to use
num_steps (int) – Number of sampling steps to use
step_scale (float) – Scaling factor for diffusion steps.
use_potentials (bool) – Whether or not to use potentials. Defaults to False.
constraints (list[dict] | None = None) – List of constraints.
templates (list[dict] | None = None) – List of templates to use for structure prediction.
properties (list[dict] | None = None) – List of additional properties to predict. Entries should match BoltzProperties.
method (str | None) – The experimental method or supervision source used for the prediction. Defaults to None. Supported values (case-insensitive) include: ‘MD’, ‘X-RAY DIFFRACTION’, ‘ELECTRON MICROSCOPY’, ‘SOLUTION NMR’, ‘SOLID-STATE NMR’, ‘NEUTRON DIFFRACTION’, ‘ELECTRON CRYSTALLOGRAPHY’, ‘FIBER DIFFRACTION’, ‘POWDER DIFFRACTION’, ‘INFRARED SPECTROSCOPY’, ‘FLUORESCENCE TRANSFER’, ‘EPR’, ‘THEORETICAL MODEL’, ‘SOLUTION SCATTERING’, ‘OTHER’, ‘AFDB’, ‘BOLTZ-1’. See the Boltz documentation for upstream details.

Returns:

Future for the folding result.

Return type:

FoldComplexResultFuture
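Because method accepts a fixed set of case-insensitive values, it can help to check the value locally before submitting a job. The sketch below validates method against the list documented above and collects the documented sampling defaults into a keyword dictionary. `normalize_method` and `boltz2_fold_kwargs` are hypothetical helpers; upper-casing the value is a convenience choice here, not something the API is stated to require.

```python
# Values copied from the documented `method` parameter of Boltz2Model.fold.
SUPPORTED_METHODS = frozenset({
    "MD", "X-RAY DIFFRACTION", "ELECTRON MICROSCOPY", "SOLUTION NMR",
    "SOLID-STATE NMR", "NEUTRON DIFFRACTION", "ELECTRON CRYSTALLOGRAPHY",
    "FIBER DIFFRACTION", "POWDER DIFFRACTION", "INFRARED SPECTROSCOPY",
    "FLUORESCENCE TRANSFER", "EPR", "THEORETICAL MODEL",
    "SOLUTION SCATTERING", "OTHER", "AFDB", "BOLTZ-1",
})


def normalize_method(method):
    """Return the upper-cased method, or None; reject undocumented values."""
    if method is None:
        return None
    candidate = method.strip().upper()
    if candidate not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    return candidate


def boltz2_fold_kwargs(diffusion_samples=1, num_recycles=3, num_steps=200,
                       step_scale=1.638, use_potentials=False, method=None):
    """Collect keyword arguments using the defaults from the fold() signature."""
    return {
        "diffusion_samples": diffusion_samples,
        "num_recycles": num_recycles,
        "num_steps": num_steps,
        "step_scale": step_scale,
        "use_potentials": use_potentials,
        "method": normalize_method(method),
    }
```

The resulting dictionary could then be unpacked into a call such as `model.fold(proteins=..., **kwargs)`, where `model` is a Boltz2Model and the proteins are prepared as described in the parameter list.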

class openprotein.fold.Boltz1xModel(session, model_id, metadata=None)[source]#

Class providing inference endpoints for the Boltz-1x open-source structure prediction model, which adds inference potentials to improve performance.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, num_recycles=3, num_steps=200, step_scale=1.638, constraints=None)[source]#

Request structure prediction with Boltz-1x model. Uses potentials with Boltz-1 model.

Parameters:

proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Each Protein object must be tagged with an msa; use Protein.single_sequence_mode as the msa to run in single-sequence mode. Alternatively, supply an MSAFuture to use all of its query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in folded output.
diffusion_samples (int) – Number of diffusion samples to use
num_recycles (int) – Number of recycling steps to use
num_steps (int) – Number of sampling steps to use
step_scale (float) – Scaling factor for diffusion steps.
constraints (Optional[List[dict]]) – List of constraints.

Returns:

Future for the folding complex result.

Return type:

FoldComplexResultFuture

class openprotein.fold.Boltz1Model(session, model_id, metadata=None)[source]#

Class providing inference endpoints for the Boltz-1 open-source structure prediction model.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, num_recycles=3, num_steps=200, step_scale=1.638, use_potentials=False, constraints=None)[source]#

Request structure prediction with Boltz-1 model.

Parameters:

proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Each Protein object must be tagged with an msa; use Protein.single_sequence_mode as the msa to run in single-sequence mode. Alternatively, supply an MSAFuture to use all of its query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in folded output.
diffusion_samples (int) – Number of diffusion samples to use
num_recycles (int) – Number of recycling steps to use
num_steps (int) – Number of sampling steps to use
step_scale (float) – Scaling factor for diffusion steps.
use_potentials (bool) – Whether or not to use potentials. Defaults to False.
constraints (Optional[List[dict]]) – List of constraints.

Returns:

Future for the folding complex result.

Return type:

FoldComplexResultFuture

class openprotein.fold.AlphaFold2Model(session, model_id, metadata=None)[source]#

Class providing inference endpoints for AlphaFold2 structure prediction models, based on the implementation by ColabFold.

fold(proteins=None, num_recycles=None, num_models=1, num_relax=0, **kwargs)[source]#

Post sequences to the AlphaFold2 model.

Parameters:

proteins (List[Protein] | MSAFuture) – List of protein sequences to fold. Protein objects must be tagged with an msa. Alternatively, supply an MSAFuture to use all query sequences as a multimer.
num_recycles (int) – Number of times to recycle models.
num_models (int) – Number of models to run; the best model will be used.
num_relax (int) – Maximum number of iterations for relax.

Returns:

job

Return type:

Job

class openprotein.fold.ESMFoldModel(session, model_id, metadata=None)[source]#

Class providing inference endpoints for Facebook’s ESMFold structure prediction models.

model_id: str = 'esmfold'#

fold(sequences, num_recycles=None)[source]#

Fold sequences using this model.

Parameters:

sequences (Sequence[bytes | str]) – sequences to fold
num_recycles (int | None) – number of times to recycle models

Return type:

FoldResultFuture
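Since fold() accepts a mix of bytes and str sequences, a small normalization step before submission can avoid surprises. The following is a sketch only: `prepare_sequences` and `fold_with_esmfold` are hypothetical helpers, whitespace removal is a preprocessing choice made here rather than a documented requirement, and `model` is assumed to be an ESMFoldModel (or any object exposing the same fold signature).

```python
def prepare_sequences(sequences):
    """Decode bytes to str and drop whitespace from each sequence."""
    cleaned = []
    for seq in sequences:
        text = seq.decode() if isinstance(seq, bytes) else seq
        text = "".join(text.split())  # remove spaces and line breaks
        if not text:
            raise ValueError("empty sequence")
        cleaned.append(text)
    return cleaned


def fold_with_esmfold(model, sequences, num_recycles=None):
    """Submit cleaned sequences using the documented fold() arguments."""
    return model.fold(prepare_sequences(sequences), num_recycles=num_recycles)
```

According to the signature above, the value returned by fold() is a FoldResultFuture, so the result of `fold_with_esmfold` can be waited on or streamed like any other fold future.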

Results#

class openprotein.fold.FoldResultFuture(session, job=None, metadata=None, sequences=None, max_workers=10)[source]#

Fold results represented as a future.

job#

The fold job associated with this future.

Type:: FoldJob

classmethod create(session, job=None, metadata=None, **kwargs)[source]#

Factory method to create a FoldResultFuture or FoldComplexResultFuture.

Parameters:

session (APISession) – The API session to use for requests.
job (FoldJob) – The fold job associated with this future.
**kwargs – Additional keyword arguments.

Returns:

An instance of FoldResultFuture or FoldComplexResultFuture depending on the model.

Return type:

FoldResultFuture or FoldComplexResultFuture

property sequences: list[bytes]#

Get the sequences submitted for the fold request.

Returns:: List of sequences.
Return type:: list[bytes]

property id#

Get the ID of the fold request.

Returns:: Fold job ID.
Return type:: str

property metadata: FoldMetadata#: The fold metadata.

property model_id: str#: The fold model used.

get(verbose=False)[source]#

Retrieve the fold results as a list of tuples mapping sequence to PDB-encoded string.

Parameters:: verbose (bool, optional) – If True, print verbose output. Default is False.
Returns:: List of tuples mapping sequence to PDB-encoded string.
Return type:: list[tuple[str, str]]
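Because get() returns (sequence, PDB string) pairs, writing each structure to its own file is straightforward. The sketch below assumes only that shape; `save_pdbs` is a hypothetical helper, and the hash-based file naming is an arbitrary choice made here to avoid very long file names.

```python
import hashlib
from pathlib import Path


def save_pdbs(results, out_dir):
    """Write each (sequence, pdb) pair to out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for sequence, pdb in results:
        # Accept str or bytes for both fields, since the docs mention both.
        seq_bytes = sequence if isinstance(sequence, bytes) else sequence.encode()
        digest = hashlib.sha1(seq_bytes).hexdigest()[:12]
        path = out / f"fold_{digest}.pdb"
        if isinstance(pdb, bytes):
            path.write_bytes(pdb)
        else:
            path.write_text(pdb)
        written.append(path)
    return written
```

A typical call would be `save_pdbs(future.get(), "structures")`, where `future` is a FoldResultFuture whose job has finished.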

get_item(sequence)[source]#

Get fold results for a specified sequence.

Parameters:: sequence (bytes) – Sequence to fetch results for.
Returns:: Fold result for the specified sequence.
Return type:: bytes

cancelled()#

Check if the job has been cancelled.

Returns:: True if the job is cancelled, False otherwise.
Return type:: bool

property created_date: datetime#: The creation timestamp of the job.

done()#

Check if the job has completed.

Returns:: True if the job is done, False otherwise.
Return type:: bool

property end_date: datetime | None#: The end timestamp of the job.

property job_id: str#: The unique identifier of the job.

property job_type: str#: The type of the job.

property progress_counter: int#: The progress counter of the job.

refresh()#: Refresh the job status and internal job object.

property start_date: datetime | None#: The start timestamp of the job.

property status: JobStatus#: The current status of the job.

stream()#

Retrieve results for this job as a stream.

Returns:: A generator that yields (key, value) tuples.
Return type:: Generator
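Since stream() yields (key, value) tuples, results can be consumed lazily instead of all at once. A minimal sketch, assuming only that tuple shape: `collect_stream` is a hypothetical helper that gathers the pairs into a dictionary and can stop early.

```python
from itertools import islice


def collect_stream(pairs, limit=None):
    """Gather (key, value) pairs into a dict, reading at most `limit` items."""
    collected = {}
    # islice with limit=None simply iterates over everything.
    for key, value in islice(pairs, limit):
        collected[key] = value
    return collected
```

For example, `collect_stream(future.stream(), limit=10)` would read only the first ten results from a fold future.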

wait(interval=5, timeout=None, verbose=False)#

Wait for the job to complete, then fetch results.

Parameters:

interval (int, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.
timeout (int | None, optional) – Maximum time in seconds to wait. Defaults to None.
verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

The results of the job.

Return type:

Any

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for the job to complete.

Parameters:

interval (float, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.
timeout (int, optional) – Maximum time in seconds to wait. Defaults to None.
verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

True if the job completed successfully.

Return type:

bool

Notes

This method does not fetch the job results, unlike wait().
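To illustrate the polling behaviour described for wait_until_done(), here is a rough sketch built only from the documented refresh() and done() methods. It is not the library's implementation: `poll_until_done` is a hypothetical helper, raising TimeoutError on expiry is a choice made here, and done() may also be true for jobs that finished unsuccessfully, which this sketch does not distinguish.

```python
import time


def poll_until_done(future, interval=5.0, timeout=None):
    """Poll a job-like object until done() is true or the timeout expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        future.refresh()  # update the job status before checking it
        if future.done():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(interval)
```

As with wait_until_done(), this only waits; fetching the results afterwards would still require a call such as get().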