Overview

This section provides an overview of available tasks, models and benchmarks in MTEB.

  • Benchmarks


    All the popular benchmarks for evaluating embeddings in one place

    See benchmarks

  • Models


    Reproducible model implementations, for any modality and language.

    See models

  • Tasks


    Our comprehensive collection of tasks for evaluating embeddings

    See tasks

Tasks

While MTEB covers many task types, we group them into five broad categories based on the type of evaluation they require. These categories are not mutually exclusive, but they provide a useful way to navigate the large collection of tasks in MTEB.

  • Retrieval


    Asymmetric matching between queries and a corpus across different embedding regions.

    See tasks

  • Classification


    Embeddings linearly separable by category, evaluated using a classifier or regression probe.

    See tasks

  • Clustering


    Globally coherent embeddings where distances reflect semantic grouping.

    See tasks

  • Semantic similarity


    Fine-grained similarity between item pairs, where cosine similarity reflects human judgments.

    See tasks

  • Pair classification


    Embeddings capturing relationships between item pairs, such as entailment or paraphrase.

    See tasks
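As a concrete illustration of the semantic similarity category above, the sketch below scores sentence pairs with cosine similarity and compares those scores against human judgments using Spearman rank correlation, the metric typically reported for such tasks. The embeddings here are toy vectors standing in for real model output, and the gold scores are hypothetical; this is a minimal sketch of the evaluation idea, not MTEB's actual implementation.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def spearman(x, y):
    # Spearman rank correlation, computed as the Pearson correlation
    # of the ranks (assumes no ties, which holds for the toy data below).
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

# Toy embeddings for three sentence pairs (stand-ins for model output).
pairs = [
    (np.array([1.0, 0.0, 0.0]), np.array([0.9, 0.1, 0.0])),  # near-duplicates
    (np.array([1.0, 0.0, 0.0]), np.array([0.5, 0.5, 0.0])),  # loosely related
    (np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])),  # unrelated
]
human_scores = [5.0, 3.0, 0.0]  # hypothetical gold similarity judgments

model_scores = [cosine_similarity(a, b) for a, b in pairs]
print(spearman(model_scores, human_scores))  # ranks agree perfectly -> 1.0
```

A model is judged not on the absolute cosine values but on whether their ordering tracks the human ordering, which is why a rank correlation is used.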