Submit Results

Overview

The ResultCache class manages evaluation results locally and submits them to the official results repository. Use it to cache results, avoid re-computation, and contribute results back to the community.

Loading Results

For a full guide on loading and working with results — including filtering, dataframe conversion, and benchmark scoring — see Loading Results.

Quick Start

A complete example that evaluates a model, caches the results, and prepares a submission:

import mteb

# 1. Initialize cache
cache = mteb.ResultCache()

# 2. Evaluate model
model_meta = mteb.get_model_meta("sentence-transformers/all-MiniLM-L6-v2")
task = mteb.get_task("ArguAna")

mteb.evaluate(model_meta, task, cache=cache)

# 3. Submit results
cache.submit_results(model_meta, create_pr=False)  # manual review before pushing

Submitting Results

Note

Git is required for this action.

Prepare results without automatically creating a PR:

submission_info = cache.submit_results(
    models=["sentence-transformers/all-MiniLM-L6-v2"],
    create_pr=False
)

# submit_results logs the manual submission instructions
print(f"Prepared submission at: {submission_info['path']}")

Note

Git and the GitHub CLI are required for this action. You also need to install the mteb[github] extra dependencies and configure GitHub integration, either by signing in with gh auth login or by setting up your Git credential helper.

pip install mteb[github]
# or, with uv:
uv pip install mteb[github]
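Before attempting the automatic PR flow, it can help to confirm that the required command-line tools are reachable. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of mteb.

```python
import shutil


def missing_tools(tools=("git", "gh")):
    """Return the names of required executables that are not found on PATH.

    Hypothetical helper for illustration; it only checks that the
    executables exist, not that you are signed in to GitHub.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print(f"Missing required tools: {', '.join(missing)}")
    else:
        print("git and gh are both available")
```

This check does not cover authentication. Running `gh auth status` in a terminal reports whether the GitHub CLI is signed in.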

Then run your code:

submission_info = cache.submit_results(
    models=["sentence-transformers/all-MiniLM-L6-v2"],
    create_pr=True
)

if submission_info.get("pr_url"):
    print(f"PR created: {submission_info['pr_url']}")
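The examples in this section read two keys from the returned value, `path` and `pr_url`. As a hedged sketch, assuming the return value behaves like a dictionary and that either key may be absent, a small hypothetical helper (`describe_submission`, not part of mteb) can report whichever detail is available:

```python
def describe_submission(submission_info: dict) -> str:
    """Summarize a submission result in one line.

    Assumes a dict-like value with optional "pr_url" and "path" keys,
    as used in the examples on this page; other keys are ignored.
    """
    pr_url = submission_info.get("pr_url")
    if pr_url:
        return f"PR created: {pr_url}"

    path = submission_info.get("path")
    if path:
        return f"Prepared submission at: {path} (no PR created)"

    return "No submission details available"
```

Using `.get` rather than indexing avoids a `KeyError` if a key is missing, for example when no PR was created.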

Batch Submission

Submit multiple models at once:

models = [
    "sentence-transformers/all-MiniLM-L6-v2",
    "sentence-transformers/all-mpnet-base-v2",
    "BAAI/bge-base-en-v1.5"
]

cache.submit_results(models=models, create_pr=False)
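If you would rather isolate failures per model than submit everything in one call, one option is to loop over the model names yourself. The sketch below is an illustration only: `submit_each` is a hypothetical helper, and the `submit` argument is any callable you supply that takes a single model name.

```python
from typing import Callable, Dict, Iterable, Tuple


def submit_each(
    models: Iterable[str],
    submit: Callable[[str], object],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Call `submit` once per unique model name and collect the outcomes.

    Returns (succeeded, failed): `succeeded` maps model name to whatever
    `submit` returned, `failed` maps model name to the error message.
    """
    succeeded: Dict[str, object] = {}
    failed: Dict[str, str] = {}

    # dict.fromkeys drops duplicate names while preserving order.
    for name in dict.fromkeys(models):
        try:
            succeeded[name] = submit(name)
        except Exception as exc:  # keep going so one failure does not stop the rest
            failed[name] = str(exc)

    return succeeded, failed
```

With the cache from the examples above, this could be called as `submit_each(models, lambda name: cache.submit_results(models=[name], create_pr=False))`. Note that this prepares one submission per model rather than a single combined one, which may or may not be what you want.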

After Submission

Once the PR is created, your results wait for review; we aim to complete reviews in less than a week. To speed up the review, please make sure to fill out the checklist. During the review process, we may ask you about suspicious results or ask you to check for potential data leakage.

API Reference