
Using ESMFold#

This tutorial shows you how to use the ESMFold model to predict the 3D structure of a protein sequence of interest. We recommend using ESMFold with single-chain sequences. If you have a multi-chain sequence, see the Using AlphaFold2 tutorial instead.

What you need before getting started#

Specify a sequence of interest whose structure you want to predict. The example used here is Interleukin 2:

[1]:
import openprotein

# Login to your session
session = openprotein.connect()

sequence = "MYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGMYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSEP"
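Before submitting a sequence, a quick sanity check can catch stray characters such as whitespace or stop symbols. The sketch below is a minimal example, not part of the openprotein package: `check_sequence` and `STANDARD_AMINO_ACIDS` are hypothetical names, and the check assumes only the 20 standard one-letter codes are wanted (this tutorial does not state whether ambiguity codes such as X are accepted).

```python
# Hypothetical helper for illustration; not part of the openprotein package.
# Assumption: only the 20 standard one-letter amino acid codes are expected.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position, character) for every non-standard character."""
    return [
        (position, char)
        for position, char in enumerate(seq, start=1)
        if char not in STANDARD_AMINO_ACIDS
    ]


# First 20 residues of the example sequence used in this tutorial.
example = "MYRMQLLSCIALSLALVTNS"
problems = check_sequence(example)
print(f"length={len(example)}, problems={problems}")
```

An empty list means every character is a standard residue code; otherwise each tuple points at the offending position.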

Getting the Model#

Create the model object for ESMFold:

[2]:
esmfoldmodel = session.fold.esmfold
esmfoldmodel.fold?
Signature:
esmfoldmodel.fold(
    sequences: Sequence[openprotein.molecules.complex.Complex | openprotein.molecules.protein.Protein | str | bytes],
    num_recycles: int | None = None,
) -> openprotein.fold.future.FoldResultFuture
Docstring:
Fold sequences using this model.

Parameters
----------
sequences : Sequence[bytes | str]
    sequences to fold
num_recycles : int | None
    number of times to recycle models
Returns
-------
    FoldResultFuture
File:      ~/Projects/openprotein/openprotein-python-private/openprotein/fold/esmfold.py
Type:      method

Predicting your sequence#

Call ESMFold on your sequence. The num_recycles parameter sets how many times the model refines the structure, with each cycle taking the previous cycle’s output as its input. It accepts integers between 1 and 48.

You can submit a Complex, a Protein, or a plain sequence, which is treated as a single Protein. To batch a request, pass a list of sequences.

[3]:
esm = esmfoldmodel.fold([sequence.encode()], num_recycles=1)

esm
[3]:
FoldJob(num_records=1, job_id='f3370817-5fdd-400d-bd26-a90805cd9f4a', job_type=<JobType.embeddings_fold: '/embeddings/fold'>, status=<JobStatus.PENDING: 'PENDING'>, created_date=datetime.datetime(2026, 1, 16, 16, 18, 19, 489442, tzinfo=TzInfo(0)), start_date=None, end_date=None, prerequisite_job_id=None, progress_message=None, progress_counter=0, sequence_length=None)

Wait for the job to complete with wait_until_done():

[4]:
esm.wait_until_done(verbose=True, timeout=300)
Waiting: 100%|█████████████████████████████████████████████████| 100/100 [04:40<00:00,  2.80s/it, status=SUCCESS]
[4]:
True
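To illustrate what waiting with a timeout involves, here is a generic polling sketch. It is not how openprotein implements wait_until_done(): `poll_until_done` is a hypothetical helper, the `"FAILURE"` status name is an assumption (only PENDING and SUCCESS appear in the output above), and the job is simulated with a fake status source.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str], timeout: float, interval: float = 1.0
) -> bool:
    """Hypothetical helper: poll get_status() until success, failure, or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "SUCCESS":
            return True
        if status == "FAILURE":  # assumed name for a failed terminal state
            raise RuntimeError("job failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout} seconds")
        time.sleep(interval)


# Simulated job that reports PENDING twice before succeeding.
fake_statuses = iter(["PENDING", "PENDING", "SUCCESS"])
finished = poll_until_done(lambda: next(fake_statuses), timeout=5, interval=0.01)
print(finished)
```

Using a monotonic clock for the deadline keeps the timeout unaffected by system clock adjustments while the job runs.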

Retrieving the Results#

Getting the Predicted Structure#

Fetch the results with get().

get() returns a list of Structure objects, one for each input in the request.

From each Structure, we can access the Complex and then the folded Protein of interest.

[5]:
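result = esm.get()
structure = result[0]  # one Structure per submitted input
predicted_complex = structure[0]  # named to avoid shadowing the built-in `complex`
protein = predicted_complex.get_protein("A")  # chains are auto-named in alphabetical order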

print("Predicted structure:", structure)
print("Predicted protein sequence:", protein.sequence)
Predicted structure: <openprotein.molecules.structure.Structure object at 0x7f563b329400>
Predicted protein sequence: b'MYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGMYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSEP'

Visualize the structure using molviewspec:

[6]:
%pip install molviewspec
from molviewspec import create_builder

def display_structure(structure_string):
    builder = create_builder()
    structure = builder.download(url="mystructure.cif")\
        .parse(format="mmcif")\
        .model_structure()\
        .component()\
        .representation()\
        .color_from_source(schema="atom",
                            category_name="atom_site",
                            field_name="auth_asym_id",
                            palette={"kind": "categorical", # color by chain
                                    "colors": ["blue", "red", "green", "orange"],
                                    "mode": "ordinal"}
                          )
    return builder.molstar_notebook(data={'mystructure.cif': structure_string}, width=500, height=400)

display_structure(structure.to_string(format="cif"))

Getting the PAE#

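ESMFold also provides the PAE (Predicted Aligned Error), an N × N matrix in which each entry estimates the expected position error of one residue when the predicted and true structures are aligned on another. Low values between two regions indicate that their relative placement is predicted with confidence, which makes the PAE useful for assessing how domains or chains are positioned relative to each other.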

[7]:
# Retrieve the PAE matrix
pae_matrix = esm.get_pae()[0] # note that the pae result is also a list
print("\nPAE matrix shape:", pae_matrix.shape)

PAE matrix shape: (239, 239)
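One common way to read a PAE matrix is to average it over blocks of residues, for example within a domain versus between two domains. The sketch below uses a small synthetic matrix rather than real output. `mean_block_pae` is a hypothetical helper, and it assumes the PAE behaves like a NumPy array, as the `.shape` attribute above suggests.

```python
import numpy as np


def mean_block_pae(pae: np.ndarray, rows: slice, cols: slice) -> float:
    """Mean PAE between two residue ranges (0-based slices).

    A PAE matrix need not be symmetric, so both directions are averaged.
    """
    return float((pae[rows, cols].mean() + pae[cols, rows].mean()) / 2)


# Synthetic 6 x 6 matrix standing in for a real PAE: two confident
# 3-residue "domains" whose relative placement is uncertain.
pae = np.full((6, 6), 20.0)
pae[:3, :3] = 2.0
pae[3:, 3:] = 3.0

within_first = mean_block_pae(pae, slice(0, 3), slice(0, 3))
between = mean_block_pae(pae, slice(0, 3), slice(3, 6))
print(f"within first block: {within_first:.1f}, between blocks: {between:.1f}")
```

With a real result, the same function could be applied to `pae_matrix` using slices that match the residue ranges of interest.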

Next steps#

From here, you can compare the predicted structure with your query structure, try another structure predictor such as AlphaFold2, or save the structure for future use:

[8]:
with open("esmfold_prediction.cif", "w") as f:
    f.write(structure.to_string(format="cif"))