API Reference#
This reference documents all public classes and methods in the OpenProtein Python SDK.
Core Components#
Session#
The main entry point for interacting with the OpenProtein.AI platform.
The base class for accessing OpenProtein API functionality.
Jobs & Futures#
API operations that start a job return Future objects for asynchronous job tracking. Call wait() to block until completion and retrieve the results, or call wait_until_done() followed by get() for more control.
API interface to get jobs.
Base class for all Futures returning results from a job.
JobStatus: PENDING, RUNNING, SUCCESS, FAILED
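The two retrieval patterns described above can be illustrated with a small stand-in class. This is a sketch of the semantics only, not the SDK's implementation: SketchFuture, its constructor arguments, and the simulated polling are hypothetical. Only the method names wait(), wait_until_done(), and get() and the four status values are taken from this page.

```python
import enum
import time


class JobStatus(enum.Enum):
    # Status values as listed on this page (local copy for the sketch).
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


class SketchFuture:
    """Hypothetical stand-in for a future; not the SDK class."""

    def __init__(self, result, polls_until_done=2):
        self._result = result
        self._polls_left = polls_until_done
        self.status = JobStatus.PENDING

    def _refresh(self):
        # Simulate one status poll against a remote job.
        if self._polls_left > 0:
            self._polls_left -= 1
            self.status = JobStatus.RUNNING
        else:
            self.status = JobStatus.SUCCESS

    def done(self):
        return self.status in (JobStatus.SUCCESS, JobStatus.FAILED)

    def wait_until_done(self, interval=0.0):
        # Block until the job reaches a terminal state; True means success.
        while not self.done():
            self._refresh()
            time.sleep(interval)
        return self.status is JobStatus.SUCCESS

    def get(self):
        # Retrieve results; only valid once the job has succeeded.
        if self.status is not JobStatus.SUCCESS:
            raise RuntimeError(f"job not successful: {self.status.value}")
        return self._result

    def wait(self, interval=0.0):
        # Convenience: block, then return the results in one call.
        if not self.wait_until_done(interval):
            raise RuntimeError(f"job ended with {self.status.value}")
        return self.get()


# Pattern 1: one blocking call.
embeddings = SketchFuture(result=[0.1, 0.2]).wait()

# Pattern 2: wait first, check the outcome, then fetch.
future = SketchFuture(result=[0.3, 0.4])
if future.wait_until_done():
    scores = future.get()
```

The second pattern is useful when the caller wants to branch on failure, log progress, or defer downloading a large result.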
Data Primitives#
Core data structures for representing proteins, complexes, and experimental data.
Protein Structures#
Represents a protein with optional sequence, atomic coordinates, per-residue confidence scores (pLDDT), and name.
Represents a DNA sequence.
Represents an RNA sequence.
Represents a ligand with optional Chemical Component Dictionary (CCD) identifier and SMILES string.
Protein represents a single protein chain with sequence and optional MSA/structure data.
DNA, RNA, and Ligand represent non-protein components in a complex.
Complex represents a multi-chain complex (proteins plus DNA, RNA, and ligand components) for structure prediction and analysis.
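To make the shape of these primitives concrete, here is a minimal sketch using plain dataclasses. The class names ProteinRecord and LigandRecord, the field names, and the dictionary-based ComplexSketch are all hypothetical stand-ins chosen for illustration. They mirror the optional fields described above but are not the SDK's classes or constructors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class ProteinRecord:
    """Hypothetical stand-in: every field is optional, as described above."""
    name: Optional[str] = None
    sequence: Optional[str] = None
    coordinates: Optional[List[Tuple[float, float, float]]] = None
    plddt: Optional[List[float]] = None  # one confidence score per residue


@dataclass
class LigandRecord:
    """Hypothetical stand-in for a small-molecule component."""
    ccd: Optional[str] = None     # Chemical Component Dictionary identifier
    smiles: Optional[str] = None  # SMILES string


# A multi-chain complex, modeled here as a mapping from chain ID to component.
ComplexSketch = Dict[str, Union[ProteinRecord, LigandRecord]]

complex_: ComplexSketch = {
    "A": ProteinRecord(name="binder", sequence="MKTAYIAKQR"),
    "B": LigandRecord(ccd="ATP"),
}

# A sequence-only protein: structure fields stay unset until a fold exists.
protein = complex_["A"]
has_structure = protein.coordinates is not None
```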
Assay Data#
API interface for calling AssayData endpoints.
Assay dataset containing your sequences and measurements, which can be used for training predictors.
Upload and manage experimental datasets with measured properties for training predictors and design workflows.
Foundation Models & Embeddings#
Generate embeddings, logits, and scores using protein language models.
Embedding API#
Embeddings API providing the interface for creating embeddings using protein language models.
Model Classes#
Each model class provides access to specific foundation models with their unique capabilities.
Class for OpenProtein's foundation model PoET 2.
Class for OpenProtein's foundation model PoET.
Class providing inference endpoints for Facebook's ESM protein language models.
Proprietary protein embedding models served by OpenProtein.
PoET2Model: Multimodal conditional model with structure and sequence conditioning (1536 dim)
PoETModel: Sequence-conditioned generative model for scoring and generation (1280 dim)
ESMModel: Meta’s ESM family (ESM1b, ESM1v, ESM2 variants, 320-2560 dim)
OpenProteinModel: OpenProtein’s proprietary models (prot-seq, rotaprot variants, 1024-1536 dim)
Result Types#
Future for manipulating results for embeddings-related requests.
Future for manipulating results for embeddings score-related requests.
Future for manipulating results for embeddings generate-related requests.
PoET Prompts & Queries#
Create and manage prompts for conditioning PoET models on specific protein families.
Prompt API providing the interface to create prompts for use with PoET models.
Prompt which contains a set of sequences and/or structures used to condition the PoET models.
Query containing a sequence/structure used to query the PoET-2 model, which opens up new workflows.
Sequence Alignment#
Generate multiple sequence alignments (MSA) for evolutionary analysis and structure prediction.
Align API interface for creating alignments and MSAs (multiple sequence alignments) which can be used for other protein tasks.
Represents a future for MSA (Multiple Sequence Alignment) results.
Supports MAFFT, ClustalOmega, and homology search via MMseqs2. An MSA is required for AlphaFold2 and Boltz folding.
Dimensionality Reduction#
Reduce embedding dimensions for visualization and downstream analysis.
SVD#
SVD API providing the interface for creating and using SVD models.
SVD model that can be used to create reduced embeddings.
UMAP#
UMAP API providing the interface to fit and run UMAP visualizations.
UMAP model that can be used to create projected embeddings.
Property Prediction#
Train Gaussian Process models on assay data and predict properties for novel sequences.
Predictor API providing the interface to train predictors and run predictions with them.
Class providing the predict endpoint for fitted predictor models.
Prediction results represented as a future.
Train on AssayDataset with any embedding model, then predict fitness, stability, or custom properties.
Sequence Design#
Generate optimized sequences using genetic algorithms and trained predictors.
Design API providing the interface to design novel proteins based on your design criteria.
A future object that will hold the results of the design job.
Design sequences to maximize predicted properties while maintaining similarity to parent sequences.
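The idea of maximizing a predicted property while staying close to a parent can be sketched with a toy genetic algorithm. Everything below is an assumption made for illustration: toy_score stands in for a trained predictor, and the function names, mutation rate, population size, and Hamming-distance limit are hypothetical. It does not reproduce the algorithm or parameters used by the Design API.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq: str) -> float:
    # Placeholder for a trained predictor: fraction of alanine residues.
    return seq.count("A") / len(seq)


def hamming(a: str, b: str) -> int:
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def mutate(seq: str, rng: random.Random, rate: float) -> str:
    # Resample each position independently with probability `rate`.
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq)


def design(parent: str, max_mutations: int, generations: int = 30,
           pop_size: int = 40, rate: float = 0.1, seed: int = 0) -> str:
    rng = random.Random(seed)
    population = [parent]
    for _ in range(generations):
        # Propose children from the current survivors.
        children = [mutate(rng.choice(population), rng, rate) for _ in range(pop_size)]
        # Keep only candidates within the allowed distance of the parent.
        feasible = [c for c in children if hamming(c, parent) <= max_mutations]
        # Elitism: survivors compete with children, so the best score never drops.
        pool = set(population + feasible)
        ranked = sorted(pool, key=lambda s: (-toy_score(s), s))
        population = ranked[: pop_size // 4]
    return population[0]


parent = "MKTVLGSEQW"
best = design(parent, max_mutations=3)
```

Treating parent similarity as a hard constraint keeps the sketch short. A soft penalty subtracted from the score is a common alternative when some drift from the parent is acceptable.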
Structure Prediction#
Predict 3D structures from sequences using state-of-the-art folding models.
Fold API#
Fold API provides a high level interface for making protein structure predictions.
Folding Models#
Class providing inference endpoints for Facebook's ESMFold structure prediction models.
Class providing inference endpoints for AlphaFold2 structure prediction models, based on the implementation by ColabFold.
Class providing inference endpoints for the Boltz-1 open-source structure prediction model.
Class providing inference endpoints for the Boltz-1x open-source structure prediction model, which adds the use of inference potentials to improve performance.
Class providing inference endpoints for the Boltz-2 structure prediction model, which jointly models complex structures and binding affinities.
ESMFold: Fast single-chain folding, no MSA required
AlphaFold2: High-accuracy multi-chain folding, requires MSA
Boltz1/1x/2: Multi-chain folding with constraints; Boltz-2 adds binding affinity prediction
Result Types#
Fold results represented as a future.
Future for manipulating results of a fold complex request.
Structure Generation#
Generate novel protein structures using diffusion models.
API-like accessor that groups all available protein models.
RFdiffusion model for generating de novo protein structures.
BoltzGen model for generating de novo protein structures.
RFdiffusion: Diffusion-based structure generation for binder design
BoltzGen: Generative model for protein structure design
Enumerations & Constants#
Common enumerations used throughout the SDK.
ReductionType: MEAN, SUM for embedding reduction
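As an illustration of what the two reduction options mean numerically, the sketch below collapses a per-residue embedding (one vector per sequence position) into a single vector. It assumes the reduction is applied across sequence positions. The Reduction enum and reduce_embedding function are hypothetical local names, not SDK identifiers.

```python
import enum
from typing import List


class Reduction(enum.Enum):
    # Hypothetical local enum mirroring the MEAN and SUM options named above.
    MEAN = "MEAN"
    SUM = "SUM"


def reduce_embedding(per_residue: List[List[float]], how: Reduction) -> List[float]:
    """Collapse an L x D per-residue embedding to one D-length vector."""
    if not per_residue:
        raise ValueError("embedding must contain at least one residue")
    # zip(*rows) iterates over columns, i.e. one embedding dimension at a time.
    totals = [sum(column) for column in zip(*per_residue)]
    if how is Reduction.SUM:
        return totals
    return [t / len(per_residue) for t in totals]


# Three residues, two embedding dimensions.
emb = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
summed = reduce_embedding(emb, Reduction.SUM)     # [9.0, 12.0]
averaged = reduce_embedding(emb, Reduction.MEAN)  # [3.0, 4.0]
```

Mean reduction yields vectors on a comparable scale regardless of sequence length, while sum reduction grows with length.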