API Reference#

This reference documents all public classes and methods in the OpenProtein Python SDK.

Core Components#

Session#

The main entry point for interacting with the OpenProtein.AI platform.

openprotein.OpenProtein([username, ...])

The base class for accessing OpenProtein API functionality.

Jobs & Futures#

All API operations return Future objects for asynchronous job tracking. Use wait() to block until completion and retrieve results, or wait_until_done() followed by get() for more control.

openprotein.jobs.JobsAPI(session)

API interface to get jobs.

openprotein.jobs.Future(session, job)

Base class for all Futures returning results from a job.

openprotein.jobs.Job(*, job_id, job_type, ...)

openprotein.jobs.JobStatus(value[, names, ...])

JobStatus: PENDING, RUNNING, SUCCESS, FAILED
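
The two retrieval styles described above, wait() on its own versus wait_until_done() followed by get(), can be illustrated with a small stand-in. This is a minimal sketch and not the SDK's implementation: `ToyFuture`, its `polls_needed` argument, and the local `JobStatus` enum are hypothetical and only mirror the method names and the four states listed in this section. No API call is made.

```python
from enum import Enum


class JobStatus(Enum):
    # Stand-in mirroring the four states listed above; not the SDK's enum.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


class ToyFuture:
    """Hypothetical stand-in for a job future; it never contacts a server."""

    def __init__(self, result, polls_needed=2):
        self._result = result
        self._polls_left = polls_needed
        self.status = JobStatus.PENDING

    def _poll(self):
        # Each poll advances the simulated job by one step.
        if self._polls_left > 0:
            self._polls_left -= 1
            self.status = JobStatus.RUNNING
        else:
            self.status = JobStatus.SUCCESS

    def done(self):
        return self.status in (JobStatus.SUCCESS, JobStatus.FAILED)

    def wait_until_done(self):
        # Block until the job reaches a terminal state, without fetching results.
        while not self.done():
            self._poll()
        return self.status is JobStatus.SUCCESS

    def get(self):
        # Fetch results only once the job has succeeded.
        if self.status is not JobStatus.SUCCESS:
            raise RuntimeError(f"job is not finished: {self.status.name}")
        return self._result

    def wait(self):
        # Convenience path: block, then return the results in one call.
        self.wait_until_done()
        return self.get()


# One-call style.
scores = ToyFuture([0.1, 0.9]).wait()

# Two-step style: inspect the outcome before retrieving anything.
future = ToyFuture("embedding-bytes")
if future.wait_until_done():
    payload = future.get()
```

The two-step form is useful when the caller wants to check the final status, or log it, before paying the cost of downloading results.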

Data Primitives#

Core data structures for representing proteins, complexes, and experimental data.

Protein Structures#

openprotein.protein.Protein([sequence, ...])

Represents a protein with optional sequence, atomic coordinates, per-residue confidence scores (pLDDT), and name.

openprotein.chains.DNA(sequence[, chain_id, ...])

Represents a DNA sequence.

openprotein.chains.RNA(sequence[, chain_id, ...])

Represents an RNA sequence.

openprotein.chains.Ligand(*[, chain_id, ...])

Represents a ligand with optional Chemical Component Dictionary (CCD) identifier and SMILES string.

Protein represents a single protein chain with sequence and optional MSA/structure data.

DNA/RNA/Ligand represent non-protein components in a complex.

Model represents a multi-chain complex (proteins + chains) for structure prediction and analysis.

Assay Data#

openprotein.data.DataAPI(session)

API interface for calling AssayData endpoints.

openprotein.data.AssayDataset(session, metadata)

Assay dataset containing your sequences and measurements, which can be used to train predictors.

openprotein.data.AssayMetadata(*, ...[, ...])

Upload and manage experimental datasets with measured properties for training predictors and design workflows.

Foundation Models & Embeddings#

Generate embeddings, logits, and scores using protein language models.

Embedding API#

openprotein.embeddings.EmbeddingsAPI(session)

Embeddings API providing the interface for creating embeddings using protein language models.

Model Classes#

Each model class provides access to specific foundation models with their unique capabilities.

openprotein.embeddings.PoET2Model(session, ...)

Class for OpenProtein's foundation model PoET 2.

openprotein.embeddings.PoETModel(session, ...)

Class for OpenProtein's foundation model PoET.

openprotein.embeddings.ESMModel(session, ...)

Class providing inference endpoints for Facebook's ESM protein language models.

openprotein.embeddings.OpenProteinModel(...)

Proprietary protein embedding models served by OpenProtein.

PoET2Model: Multimodal conditional model with structure and sequence conditioning (1536 dim)

PoETModel: Sequence-conditioned generative model for scoring and generation (1280 dim)

ESMModel: Meta’s ESM family (ESM1b, ESM1v, ESM2 variants, 320-2560 dim)

OpenProteinModel: OpenProtein’s proprietary models (prot-seq, rotaprot variants, 1024-1536 dim)

Result Types#

openprotein.embeddings.EmbeddingsResultFuture(...)

Future for manipulating results for embeddings-related requests.

openprotein.embeddings.EmbeddingsScoreFuture(...)

Future for manipulating results for embeddings score-related requests.

openprotein.embeddings.EmbeddingsGenerateFuture(...)

Future for manipulating results for embeddings generate-related requests.

PoET Prompts & Queries#

Create and manage prompts for conditioning PoET models on specific protein families.

openprotein.prompt.PromptAPI(session)

Prompt API providing the interface to create prompts for use with PoET models.

openprotein.prompt.Prompt(session[, job, ...])

Prompt which contains a set of sequences and/or structures used to condition the PoET models.

openprotein.prompt.Query(session, metadata)

Query containing a sequence and/or structure used to query the PoET-2 model.

Sequence Alignment#

Generate multiple sequence alignments (MSA) for evolutionary analysis and structure prediction.

openprotein.align.AlignAPI(session)

Align API interface for creating alignments and MSAs (multiple sequence alignments) which can be used for other protein tasks.

openprotein.align.MSAFuture(session, job[, ...])

Represents a future for MSA (Multiple Sequence Alignment) results.

The Align API supports MAFFT, ClustalOmega, and homology search via MMseqs2. An MSA is required for AlphaFold2 and Boltz folding.

Dimensionality Reduction#

Reduce embedding dimensions for visualization and downstream analysis.

SVD#

openprotein.svd.SVDAPI(session)

SVD API providing the interface for creating and using SVD models.

openprotein.svd.SVDModel(session[, job, ...])

SVD model that can be used to create reduced embeddings.
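
To make the idea of a reduced embedding concrete, the sketch below projects a few toy embedding vectors onto their leading right singular vector, which is a rank-1 truncated SVD. It is a pure-Python illustration under stated assumptions, not how the SVD endpoint is implemented: the function names and the toy data are hypothetical, and the singular vector is approximated by power iteration rather than a full decomposition.

```python
import math


def top_singular_direction(rows, iterations=100):
    """Approximate the leading right singular vector of `rows` by power iteration."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply X^T X to v without forming the matrix explicitly.
        xv = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(rows[i][j] * xv[i] for i in range(len(rows))) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, direction):
    """Reduce each row to a single coordinate along `direction`."""
    return [sum(a * b for a, b in zip(r, direction)) for r in rows]


# Toy 2-dimensional "embeddings"; most of the variation lies along the first axis.
embeddings = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]
direction = top_singular_direction(embeddings)
reduced = project(embeddings, direction)
```

A practical reduction keeps several components rather than one, but the principle is the same: each embedding is replaced by its coordinates along the directions that capture the most variation.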

UMAP#

openprotein.umap.UMAPAPI(session)

UMAP API providing the interface to fit and run UMAP visualizations.

openprotein.umap.UMAPModel(session[, job, ...])

UMAP model that can be used to create projected embeddings.

Property Prediction#

Train Gaussian Process models on assay data and predict properties for novel sequences.

openprotein.predictor.PredictorAPI(session)

Predictor API providing the interface to train predictors and make predictions with them.

openprotein.predictor.PredictorModel(session)

Class providing predict endpoint for fitted predictor models.

openprotein.predictor.PredictionResultFuture(...)

Prediction results represented as a future.

Train on AssayDataset with any embedding model, then predict fitness, stability, or custom properties.

Sequence Design#

Generate optimized sequences using genetic algorithms and trained predictors.

openprotein.design.DesignAPI(session)

Design API providing the interface to design novel proteins based on your design criteria.

openprotein.design.DesignFuture(session[, ...])

A future object that will hold the results of the design job.

Design sequences to maximize predicted properties while maintaining similarity to parent sequences.
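
The trade-off described above, maximizing a predicted property while staying close to a parent sequence, can be sketched with a toy genetic algorithm. Everything here is illustrative and assumed: `toy_score` stands in for a trained predictor, and `design`, `max_mutations`, and the other parameters are hypothetical names, not the Design API's interface.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq):
    # Hypothetical stand-in for a trained predictor: counts alanines.
    return seq.count("A")


def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def mutate(seq, rng):
    # Substitute one randomly chosen position with a random amino acid.
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def design(parent, score, max_mutations=3, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [parent]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        # Similarity constraint: discard candidates too far from the parent.
        allowed = [c for c in children if hamming(c, parent) <= max_mutations]
        # Elitism: current survivors compete with the new candidates.
        pool = set(population) | set(allowed)
        population = sorted(pool, key=lambda s: (-score(s), s))[:pop_size]
    return population[0]


parent = "MKTVLQ"
best = design(parent, toy_score)
```

Because the best candidate always survives into the next generation, the returned sequence never scores below the parent, and the Hamming filter guarantees it differs from the parent at no more than `max_mutations` positions.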

Structure Prediction#

Predict 3D structures from sequences using state-of-the-art folding models.

Fold API#

openprotein.fold.FoldAPI(session)

Fold API provides a high-level interface for making protein structure predictions.

Folding Models#

openprotein.fold.ESMFoldModel(session, model_id)

Class providing inference endpoints for Facebook's ESMFold structure prediction models.

openprotein.fold.AlphaFold2Model(session, ...)

Class providing inference endpoints for AlphaFold2 structure prediction models, based on the implementation by ColabFold.

openprotein.fold.Boltz1Model(session, model_id)

Class providing inference endpoints for the Boltz-1 open-source structure prediction model.

openprotein.fold.Boltz1xModel(session, model_id)

Class providing inference endpoints for the Boltz-1x open-source structure prediction model, which adds inference potentials to improve performance.

openprotein.fold.Boltz2Model(session, model_id)

Class providing inference endpoints for the Boltz-2 structure prediction model, which jointly models complex structures and binding affinities.

ESMFold: Fast single-chain folding, no MSA required

AlphaFold2: High-accuracy multi-chain folding, requires MSA

Boltz1/1x/2: Multi-chain folding with constraints and affinity prediction

Result Types#

openprotein.fold.FoldResultFuture(session[, ...])

Fold results represented as a future.

openprotein.fold.FoldComplexResultFuture(session)

Future for manipulating results of a fold complex request.

Structure Generation#

Generate novel protein structures using diffusion models.

openprotein.models.ModelsAPI(session)

API-like accessor that groups all available protein models.

openprotein.models.foundation.rfdiffusion.RFdiffusionModel(session)

RFdiffusion model for generating de novo protein structures.

openprotein.models.foundation.boltzgen.BoltzGenModel(session)

BoltzGen model for generating de novo protein structures.

RFdiffusion: Diffusion-based structure generation for binder design

BoltzGen: Generative model for protein structure design

Enumerations & Constants#

Common enumerations used throughout the SDK.

ReductionType: MEAN, SUM for embedding reduction
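
As an illustration of what the two reduction types compute, the sketch below collapses a per-residue embedding (one vector per sequence position) into a single fixed-length vector. It assumes the reduction runs over the sequence-length axis; the function name is hypothetical, and plain strings stand in for the enum members.

```python
def reduce_embedding(per_residue, reduction):
    """Collapse an L x D list of per-residue vectors into one D-vector."""
    # Sum each embedding dimension across all sequence positions.
    totals = [sum(column) for column in zip(*per_residue)]
    if reduction == "SUM":
        return totals
    if reduction == "MEAN":
        return [t / len(per_residue) for t in totals]
    raise ValueError(f"unknown reduction: {reduction!r}")


# Three residues, two embedding dimensions.
per_residue = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean_vec = reduce_embedding(per_residue, "MEAN")  # [3.0, 4.0]
sum_vec = reduce_embedding(per_residue, "SUM")    # [9.0, 12.0]
```

Mean reduction gives vectors on a comparable scale across sequences of different lengths, whereas sum reduction grows with sequence length.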