Overview
This section provides an overview of available tasks, models and benchmarks in MTEB.
- Benchmarks: All the popular benchmarks for evaluating embeddings in one place.
- Models: Reproducible model implementations, for any modality and language.
- Tasks: Our comprehensive collection of tasks for evaluating embeddings.
Models
- Text: Models that only encode text into embeddings.
- Multimodal: Models that encode more than two modalities.
- Image: Models that only encode images into embeddings.
- Image Text: Models that jointly encode images and text.
- Audio: Models that only encode audio into embeddings.
- Audio Text: Models that jointly encode audio and text.
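The categories above differ mainly in which input types a model can map into a shared embedding space. The sketch below illustrates that distinction with hypothetical interfaces and a toy encoder (these are illustrative only, not MTEB's actual model API): a text-only model exposes a single `encode` method, while a joint image-text model exposes one method per modality, both returning vectors in the same space.

```python
from typing import Protocol, Sequence
import numpy as np

class TextEncoder(Protocol):
    """Text-only model: a single encode method over strings."""
    def encode(self, texts: Sequence[str]) -> np.ndarray: ...

class ImageTextEncoder(Protocol):
    """Joint image-text model: both inputs land in the SAME vector space,
    so a caption embedding can be compared directly with an image embedding."""
    def encode_text(self, texts: Sequence[str]) -> np.ndarray: ...
    def encode_image(self, images: Sequence[np.ndarray]) -> np.ndarray: ...

class ToyTextEncoder:
    """Hypothetical stand-in: hashes character trigrams into a fixed-size vector."""
    def __init__(self, dim: int = 64):
        self.dim = dim

    def encode(self, texts: Sequence[str]) -> np.ndarray:
        out = np.zeros((len(texts), self.dim))
        for i, text in enumerate(texts):
            for j in range(len(text) - 2):
                out[i, hash(text[j:j + 3]) % self.dim] += 1.0
        # L2-normalise so that a dot product equals cosine similarity
        norms = np.linalg.norm(out, axis=1, keepdims=True)
        return out / np.clip(norms, 1e-9, None)

model: TextEncoder = ToyTextEncoder()
emb = model.encode(["embedding models", "embedding model"])
print(emb.shape)  # (2, 64)
```

Because both rows are unit-normalised, `emb[0] @ emb[1]` is their cosine similarity, and the two near-identical strings score highly.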
Tasks
While MTEB covers many task types, we group them into five broad categories based on the kind of evaluation they require. These categories are not mutually exclusive, but they provide a useful way to navigate the large collection of tasks in MTEB.
- Retrieval: Asymmetric matching between queries and a corpus, where queries and documents may occupy different regions of the embedding space.
- Classification: Embeddings linearly separable by category, evaluated using a classifier or regression probe.
- Clustering: Globally coherent embeddings where distances reflect semantic grouping.
- Semantic similarity: Fine-grained similarity between item pairs, where cosine similarity reflects human judgments.
- Pair classification: Embeddings capturing relationships between item pairs, such as entailment or paraphrase.
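To make the semantic similarity category concrete: STS-style tasks score a model by how well the cosine similarity of its pair embeddings correlates, by rank, with human judgments (Spearman correlation). The sketch below is a self-contained illustration with toy data, not MTEB's evaluation code; the embeddings, gold scores, and helper names are all made up for the example.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two batches of embeddings."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks
    (no tie handling, which is fine for this toy example)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])

# Toy data: 4 sentence pairs as random embeddings; the second element of
# each pair mixes in progressively more noise, so pairs get less similar.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(4, 8))
noise = rng.normal(size=(4, 8))
mix = np.array([0.95, 0.7, 0.4, 0.1])[:, None]
emb_b = mix * emb_a + (1 - mix) * noise
gold = np.array([5.0, 3.5, 2.0, 0.5])  # hypothetical human similarity scores

sims = cosine(emb_a, emb_b)
print(round(spearman(sims, gold), 2))
```

A model is considered good on such a task when this correlation is high, i.e. its cosine similarities rank the pairs in the same order humans do.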