Quickstart#

All interactions with OpenProtein are performed through an OpenProtein session. A session encapsulates authentication and provides access to all available APIs and workflows.

Creating a session#

To create a session, you must authenticate using your OpenProtein credentials. The connect() function resolves credentials using the following order of precedence:

  1. Explicit arguments passed to connect()

  2. Environment variables

  3. A configuration file at ~/.openprotein/config.toml

1. Explicit credentials

import openprotein

session = openprotein.connect(
    username="username",
    password="password",
)

2. Environment variables

Set the following variables in your shell (or via %env in Jupyter):

  • OPENPROTEIN_USERNAME

  • OPENPROTEIN_PASSWORD
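In a POSIX shell, this looks like the following (the values here are placeholders for your own credentials):

```shell
# Export the credentials before starting Python
export OPENPROTEIN_USERNAME="username"
export OPENPROTEIN_PASSWORD="password"
```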

session = openprotein.connect()

3. Configuration file

Create ~/.openprotein/config.toml with the following contents:

username = "username"
password = "password"

Then simply call:

session = openprotein.connect()

Note

For security and reproducibility, we recommend using environment variables or a configuration file rather than embedding credentials directly in code.

Using the session#

Once connected, the session provides access to all OpenProtein APIs.

For example, upload your dataset with

session.data.create(...)

or create an MSA using homology search with

session.align.create_msa(...)

Job System#

The OpenProtein.AI platform operates asynchronously. When you initiate a task with the Python client, the system schedules the job and immediately returns a response containing a unique job ID. This means long-running tasks do not block your code while they execute.

When you submit a task, such as using the method

session.align.create_msa(...)

a Future object is returned for tracking and accessing the results. You can check a job’s status using the refresh() and done() methods on this object. To block until the results are available, use the wait() method; if the job has already completed, the get() method returns the results directly.
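The polling pattern these methods support can be sketched with a minimal stand-in for the Future object (a hypothetical class for illustration only; the real one is returned by the client and talks to the server):

```python
import time

class FakeFuture:
    """Hypothetical stand-in mimicking the Future methods described above."""

    def __init__(self, polls_until_done: int = 3):
        self._polls = polls_until_done

    def refresh(self) -> None:
        # Pretend each refresh observes server-side progress
        self._polls = max(0, self._polls - 1)

    def done(self) -> bool:
        return self._polls == 0

    def get(self) -> str:
        if not self.done():
            raise RuntimeError("results not ready yet")
        return "msa results"

    def wait(self, interval: float = 0.01) -> str:
        # Poll until the job completes, then fetch the results
        while not self.done():
            self.refresh()
            time.sleep(interval)
        return self.get()

job = FakeFuture()
print(job.wait())  # → msa results
```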

In addition, you can resume a workflow using the session.jobs.get function together with the unique job ID obtained when the task was submitted. This method returns a Future object, allowing you to continue from where you left off.

For example, for a homology search workflow:

# 1. Create the MSA job
msa_job = session.align.create_msa(...)

...

# 2. Retrieve the MSA job
msa_job = session.jobs.get("f989befa-5fb2-43e1-b8d0-bb070601ceec")

# 3. Wait for completion
msa_job.wait_until_done(
    # verbose=True, # poll for progress
    # timeout=60*60, # limit the time to wait in seconds
)

# 4. Retrieve results
msa_results = msa_job.get()

# 5. Or combine step 3 and 4 with `wait`
msa_results = msa_job.wait()

# 6. Or use the future directly with a sample prompt to use with PoET
prompt_job = msa_job.sample_prompt(...)