Overview
This section provides an overview of available tasks, models and benchmarks in MTEB.
- Benchmarks: All the popular benchmarks for evaluating embeddings in one place.
- Models: Reproducible model implementations, for any modality and language.
- Tasks: Our comprehensive collection of tasks for evaluating embeddings.
Models
- Text: Models that only encode text into embeddings.
- Multimodal: Models that encode more than two modalities.
- Image: Models that only encode images into embeddings.
- Image Text: Models that jointly encode images and text.
- Audio: Models that only encode audio into embeddings.
- Audio Text: Models that jointly encode audio and text.
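The categories above differ mainly in which input types a model can map into a shared embedding space. The sketch below illustrates that distinction with hypothetical interfaces and a toy encoder (these are illustrative only, not MTEB's actual model API): a text-only model exposes a single `encode` method, while a joint image-text model exposes one method per modality, both returning vectors in the same space.

```python
from typing import Protocol, Sequence
import numpy as np

class TextEncoder(Protocol):
    """Text-only model: a single encode method over strings."""
    def encode(self, texts: Sequence[str]) -> np.ndarray: ...

class ImageTextEncoder(Protocol):
    """Joint image-text model: both inputs land in the SAME vector space,
    so a caption embedding can be compared directly with an image embedding."""
    def encode_text(self, texts: Sequence[str]) -> np.ndarray: ...
    def encode_image(self, images: Sequence[np.ndarray]) -> np.ndarray: ...

class ToyTextEncoder:
    """Hypothetical stand-in: hashes character trigrams into a fixed-size vector."""
    def __init__(self, dim: int = 64):
        self.dim = dim

    def encode(self, texts: Sequence[str]) -> np.ndarray:
        out = np.zeros((len(texts), self.dim))
        for i, text in enumerate(texts):
            for j in range(len(text) - 2):
                out[i, hash(text[j:j + 3]) % self.dim] += 1.0
        # L2-normalise so that a dot product equals cosine similarity
        norms = np.linalg.norm(out, axis=1, keepdims=True)
        return out / np.clip(norms, 1e-9, None)

model: TextEncoder = ToyTextEncoder()
emb = model.encode(["embedding models", "embedding model"])
print(emb.shape)  # (2, 64)
```

Because both rows are unit-normalised, `emb[0] @ emb[1]` is their cosine similarity, and the two near-identical strings score highly.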
Tasks
While MTEB covers many task types, we group them into five broad categories based on the kind of evaluation they require. These categories are not mutually exclusive, but they provide a useful way to navigate the large collection of tasks in MTEB.
- Retrieval: Asymmetric matching between queries and a corpus, where queries and documents may occupy different regions of the embedding space.
- Classification: Embeddings linearly separable by category, evaluated using a classifier or regression probe.
- Clustering: Globally coherent embeddings where distances reflect semantic grouping.
- Semantic similarity: Fine-grained similarity between item pairs, where cosine similarity reflects human judgments.
- Pair classification: Embeddings capturing relationships between item pairs, such as entailment or paraphrase.
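To make the semantic similarity category concrete: STS-style tasks score a model by how well the cosine similarity of its pair embeddings correlates, by rank, with human judgments (Spearman correlation). The sketch below is a self-contained illustration with toy data, not MTEB's evaluation code; the embeddings, gold scores, and helper names are all made up for the example.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two batches of embeddings."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks
    (no tie handling, which is fine for this toy example)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])

# Toy data: 4 sentence pairs as random embeddings; the second element of
# each pair mixes in progressively more noise, so pairs get less similar.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(4, 8))
noise = rng.normal(size=(4, 8))
mix = np.array([0.95, 0.7, 0.4, 0.1])[:, None]
emb_b = mix * emb_a + (1 - mix) * noise
gold = np.array([5.0, 3.5, 2.0, 0.5])  # hypothetical human similarity scores

sims = cosine(emb_a, emb_b)
print(round(spearman(sims, gold), 2))
```

A model is considered good on such a task when this correlation is high, i.e. its cosine similarities rank the pairs in the same order humans do.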