openprotein.models#

Unified access to models on the OpenProtein AI platform. Use them to work at a lower level to craft your own workflows.

Note that the Models API is a work in progress: not every model is available here yet, but we are working on bringing all models under this interface for a consistent and simple developer experience.

Interface#

class openprotein.models.ModelsAPI(session)[source]#

API-like accessor that groups all available protein models.

This class is attached to the main APISession and provides a single, consistent entry point for accessing various models.

Models#

RFdiffusion#

RFdiffusion is a diffusion model that can be used for de novo structure design and binder design. It can be used with our Query interface to define structure generation objectives in a unified manner. It also accepts contigs as defined in the official RFdiffusion repository.

Results#

class openprotein.models.RFdiffusionFuture(session, job)[source]#

Future for handling the results of an RFdiffusion job.

get_pdb(replicate=0)[source]#

Retrieve the PDB file for a specific design.

Parameters:

replicate (int) – The 0-based index of the design to retrieve. Defaults to 0.

Returns:

The content of the PDB file as a string.

Return type:

str

get(replicate=0)[source]#

Default result accessor. Returns the PDB for the given replicate (the first one by default).

cancelled()#

Check if the job has been cancelled.

Returns:

True if the job is cancelled, False otherwise.

Return type:

bool

property created_date: datetime#

The creation timestamp of the job.

done()#

Check if the job has completed.

Returns:

True if the job is done, False otherwise.

Return type:

bool

property end_date: datetime | None#

The end timestamp of the job.

property id: str#

The unique identifier of the job.

property job_id: str#

The unique identifier of the job.

property job_type: str#

The type of the job.

property progress_counter: int#

The progress counter of the job.

refresh()#

Refresh the job status and internal job object.

property start_date: datetime | None#

The start timestamp of the job.

property status: JobStatus#

The current status of the job.

wait(interval=5, timeout=None, verbose=False)#

Wait for the job to complete, then fetch results.

Parameters:
  • interval (int, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int | None, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

The results of the job.

Return type:

Any

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for the job to complete.

Parameters:
  • interval (float, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

True if the job completed successfully.

Return type:

bool

Notes

This method does not fetch the job results, unlike wait().

BoltzGen#

BoltzGen is a structure generation model that can be used for generating de novo structures as well as nanobody scaffolds. It can be used with our Query interface to define structure generation objectives in a unified manner. It also accepts a design_spec that follows the official BoltzGen design specification.

Results#

class openprotein.models.BoltzGenFuture(session, job)[source]#

Future for handling the results of a BoltzGen job.

get_pdb(replicate=0)[source]#

Retrieve the PDB file for a specific design.

Parameters:

replicate (int) – The 0-based index of the design to retrieve. Defaults to 0.

Returns:

The content of the PDB file as a string.

Return type:

str

get(replicate=0)[source]#

Default result accessor. Returns the PDB for the given replicate (the first one by default).

cancelled()#

Check if the job has been cancelled.

Returns:

True if the job is cancelled, False otherwise.

Return type:

bool

property created_date: datetime#

The creation timestamp of the job.

done()#

Check if the job has completed.

Returns:

True if the job is done, False otherwise.

Return type:

bool

property end_date: datetime | None#

The end timestamp of the job.

property id: str#

The unique identifier of the job.

property job_id: str#

The unique identifier of the job.

property job_type: str#

The type of the job.

property progress_counter: int#

The progress counter of the job.

refresh()#

Refresh the job status and internal job object.

property start_date: datetime | None#

The start timestamp of the job.

property status: JobStatus#

The current status of the job.

wait(interval=5, timeout=None, verbose=False)#

Wait for the job to complete, then fetch results.

Parameters:
  • interval (int, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int | None, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

The results of the job.

Return type:

Any

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for the job to complete.

Parameters:
  • interval (float, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

True if the job completed successfully.

Return type:

bool

Notes

This method does not fetch the job results, unlike wait().
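
The timestamp properties can be combined to see how long a job waited and ran. The following is a minimal sketch: job_timing and _StubJob are hypothetical names, and it assumes created_date, start_date and end_date are mutually comparable datetime values (all naive or all timezone-aware).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Optional, Tuple


def job_timing(future: Any) -> Tuple[Optional[timedelta], Optional[timedelta]]:
    """Return (queue_time, run_time); an entry is None if a timestamp is missing."""
    queued: Optional[timedelta] = None
    running: Optional[timedelta] = None
    if future.start_date is not None:
        # Time between submission and the job starting.
        queued = future.start_date - future.created_date
        if future.end_date is not None:
            # Time between the job starting and finishing.
            running = future.end_date - future.start_date
    return queued, running


@dataclass
class _StubJob:
    """Hypothetical stand-in exposing only the timestamp properties."""

    created_date: datetime
    start_date: Optional[datetime] = None
    end_date: Optional[datetime] = None


finished = _StubJob(
    created_date=datetime(2024, 1, 1, 12, 0, 0),
    start_date=datetime(2024, 1, 1, 12, 0, 30),
    end_date=datetime(2024, 1, 1, 12, 5, 30),
)
queue_time, run_time = job_timing(finished)
pending_timing = job_timing(_StubJob(created_date=datetime(2024, 1, 1, 12, 0, 0)))
```

For a job that is still in progress, calling refresh() first may be needed to see up-to-date timestamps.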

ProteinMPNN#

ProteinMPNN is a sequence generation model that can be used for inverse folding, and is a natural next step after using structure generation models. It can be used with our Query interface to define sequence generation objectives in a unified manner, similar to our PoET2Model.

class openprotein.models.ProteinMPNNModel(session)[source]#

Class for ProteinMPNN model.

Model inference requires an input structure which is provided by a query.

Examples

View specific model details (including supported tokens) with the ? operator, which is available in IPython and Jupyter.

>>> import openprotein
>>> session = openprotein.connect(username="user", password="password")
>>> session.models.proteinmpnn?

get_metadata()[source]#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

score(sequences, query)[source]#

Score the given sequences against the specified query.

Parameters:
  • sequences (list of bytes) – Sequences to score.

  • query (str or bytes or Protein or Query or None, optional) – Query that provides the input structure to score against.

Returns:

A future object that returns the scores of the submitted sequences.

Return type:

EmbeddingsScoreFuture

indel(sequence, query, insert=None, delete=None, **kwargs)[source]#

Score all indels of the input sequence against the specified query.

Parameters:
  • sequence (bytes) – Sequence to analyze.

  • query (str or bytes or Protein or Query or None, optional) – Query that provides the input structure to score against.

  • insert (str or None, optional) – Fragment to insert at each site.

  • delete (list of int or None, optional) – Range of fragment sizes to delete at each site.

  • **kwargs – Additional keyword arguments.

Returns:

A future object that returns the scores of the indel variants of the sequence.

Return type:

EmbeddingsScoreFuture

Raises:

ValueError – If neither insert nor delete is provided.

single_site(sequence, query)[source]#

Score all single substitutions of the input sequence against the specified query.

Parameters:
  • sequence (bytes) – Sequence to analyze.

  • query (str or bytes or Protein or Query or None, optional) – Query that provides the input structure to score against.

Returns:

A future object that returns the scores of the single-substitution variants of the sequence.

Return type:

EmbeddingsScoreFuture
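
To make "all single substitutions" concrete, the sketch below enumerates them locally for a short sequence. It is illustrative only and does not call the API: the function single_substitutions is hypothetical, the alphabet is assumed to be the 20 canonical amino acids (the tokens the model actually supports are listed in its metadata), and positions are 0-based here, which may differ from how the service indexes its results.

```python
from typing import Iterator, Tuple

# Assumption: the 20 canonical amino acids; the model's token set may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(
    sequence: bytes, alphabet: str = AMINO_ACIDS
) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (position, new_residue, mutated_sequence) for each substitution."""
    text = sequence.decode("ascii")
    for pos, wild_type in enumerate(text):
        for residue in alphabet:
            if residue == wild_type:
                continue  # skip the unchanged residue
            mutated = text[:pos] + residue + text[pos + 1:]
            yield pos, residue, mutated.encode("ascii")


variants = list(single_substitutions(b"MKT"))
```

A sequence of length L over a 20-letter alphabet has 19 × L single-substitution variants, so the number of scores grows linearly with sequence length.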

generate(query, num_samples=100, temperature=1.0)[source]#

Generate protein sequences based on a masked input query.

Parameters:
  • query (str or bytes or Protein or Query) – Query specifying the structure to generate sequences for.

  • num_samples (int, optional) – The number of samples to generate. Default is 100.

  • temperature (float, optional) – The temperature for sampling. Higher values produce more random outputs. Default is 1.0.

Returns:

A future object representing the status and information about the generation job.

Return type:

EmbeddingsGenerateFuture
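
A generation call can be wrapped so that submission and waiting happen in one step. The following is a sketch under stated assumptions, not the library's own workflow: submit_and_wait, _StubModel and _StubGenerateFuture are hypothetical, the returned future is assumed to expose wait() like the job futures documented above, and the shape of the real results is not specified here, so the stub returns placeholder strings.

```python
from typing import Any, Optional


def submit_and_wait(model: Any, query: Any, num_samples: int = 100,
                    temperature: float = 1.0,
                    timeout: Optional[int] = None) -> Any:
    """Submit a generate() job and block until its results are available."""
    future = model.generate(
        query=query, num_samples=num_samples, temperature=temperature
    )
    # Assumption: the generate future has wait(), like the other job futures.
    return future.wait(timeout=timeout)


class _StubGenerateFuture:
    """Hypothetical stand-in for the future returned by generate()."""

    def __init__(self, n: int) -> None:
        self.n = n

    def wait(self, interval=5, timeout=None, verbose=False):
        # Placeholder results; the real result format is not assumed here.
        return [f"placeholder-{i}" for i in range(self.n)]


class _StubModel:
    """Hypothetical stand-in for session.models.proteinmpnn."""

    def __init__(self) -> None:
        self.last_call = None  # remembers the arguments of the last call

    def generate(self, query, num_samples=100, temperature=1.0):
        self.last_call = {
            "query": query,
            "num_samples": num_samples,
            "temperature": temperature,
        }
        return _StubGenerateFuture(num_samples)


model = _StubModel()
results = submit_and_wait(model, query="stub-query", num_samples=4, temperature=0.5)
```

With a live session, the stub model would be replaced by session.models.proteinmpnn and the query by a structure-bearing query object.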