Skip to content

Additional Types

MTEB implements a variety of utility types to allow us and you to better know what a model returns. This page documents some of these types.

Encoder Input/Output types

Array = np.ndarray | torch.Tensor module-attribute

General array type, can be a numpy array or a torch tensor.

Conversation = list[ConversationTurn] module-attribute

A conversation, consisting of a list of messages.

BatchedInput = TextInput | CorpusInput | QueryInput | ImageInput | AudioInput | MultimodalInput module-attribute

Represents the input format accepted by the encoder for a batch of data.

The encoder can process several input types depending on the task or modality. Each type is defined as a separate structured input with its own fields.

Supported input types
  1. TextInput For pure text inputs.

{"text": ["This is a sample text.", "Another text."]}
2. CorpusInput For corpus-style inputs with titles and bodies.

{"text": ["Title 1 Body 1", "Title 2 Body 2"], "title": ["Title 1", "Title 2"], "body": ["Body 1", "Body 2"]}
3. QueryInput For query–instruction pairs, typically used in retrieval or question answering tasks. Queries and instructions are combined with the model's instruction template.

{
    "text": ["Instruction: Your task is to find document for this query. Query: What is AI?", "Instruction: Your task is to find term for definition. Query: Define machine learning."],
    "query": ["What is AI?", "Define machine learning."],
    "instruction": ["Your task is find document for this query.", "Your task is to find term for definition."]
}
4. ImageInput For visual inputs consisting of images.

{"image": [PIL.Image1, PIL.Image2]}
5. MultimodalInput For combined text–image (multimodal) inputs.

{"text": ["This is a sample text."], "image": [PIL.Image1]}

TextBatchedInput = TextInput | CorpusInput | QueryInput module-attribute

The input to the encoder for a batch of text data.

QueryDatasetType = Dataset module-attribute

Retrieval query dataset, containing queries. Should have columns id, text.

CorpusDatasetType = Dataset module-attribute

Retrieval corpus dataset, containing documents. Should have columns id, title, body.

InstructionDatasetType = Dataset module-attribute

Retrieval instruction dataset, containing instructions. Should have columns query-id, instruction.

RelevantDocumentsType = Mapping[str, Mapping[str, int]] module-attribute

Relevant documents for each query, mapping query IDs to a mapping of document IDs and their relevance scores. Should have columns query-id, corpus-id, score.

TopRankedDocumentsType = Mapping[str, list[str]] module-attribute

Top-ranked documents for each query, mapping query IDs to a list of document IDs. Should have columns query-id, corpus-ids.

RetrievalOutputType = dict[str, dict[str, float]] module-attribute

Retrieval output, containing the scores for each query-document pair.

PromptType

Bases: str, Enum

The type of prompt used in the input for retrieval models. Used to differentiate between queries and documents.

Attributes:

Name Type Description
query

A prompt that is a query.

document

A prompt that is a document.

Source code in mteb/types/_encoder_io.py
17
18
19
20
21
22
23
24
25
26
class PromptType(str, Enum):
    """The type of prompt used in the input for retrieval models. Used to differentiate between queries and documents.

    Attributes:
        query: A prompt that is a query.
        document: A prompt that is a document.
    """

    query = "query"
    document = "document"

ConversationTurn

Bases: TypedDict

A conversation, consisting of a list of messages.

Attributes:

Name Type Description
role str

The role of the message sender.

content str

The content of the message.

Source code in mteb/types/_encoder_io.py
29
30
31
32
33
34
35
36
37
38
class ConversationTurn(TypedDict):
    """A conversation, consisting of a list of messages.

    Attributes:
        role: The role of the message sender.
        content: The content of the message.
    """

    role: str
    content: str

TextInput

Bases: TypedDict

The input to the encoder for text.

Attributes:

Name Type Description
text list[str]

The text to encode. Can be a list of texts or a list of lists of texts.

Source code in mteb/types/_encoder_io.py
45
46
47
48
49
50
51
52
class TextInput(TypedDict):
    """The input to the encoder for text.

    Attributes:
        text: The text to encode. Can be a list of texts or a list of lists of texts.
    """

    text: list[str]

CorpusInput

Bases: TextInput

The input to the encoder for retrieval corpus.

Attributes:

Name Type Description
title list[str]

The title of the text to encode. Can be a list of titles or a list of lists of titles.

body list[str]

The body of the text to encode. Can be a list of bodies or a list of lists of bodies.

Source code in mteb/types/_encoder_io.py
55
56
57
58
59
60
61
62
63
64
65
66
class CorpusInput(TextInput):
    """The input to the encoder for retrieval corpus.

    Attributes:
        title: The title of the text to encode. Can be a list of titles or a
            list of lists of titles.
        body: The body of the text to encode. Can be a list of bodies or a
            list of lists of bodies.
    """

    title: list[str]
    body: list[str]

QueryInput

Bases: TextInput

The input to the encoder for queries.

Attributes:

Name Type Description
query list[str]

The query to encode. Can be a list of queries or a list of lists of queries.

conversation NotRequired[list[Conversation]]

Optional. A list of conversations, each conversation is a list of messages.

instruction NotRequired[list[str]]

Optional. A list of instructions to encode.

Source code in mteb/types/_encoder_io.py
69
70
71
72
73
74
75
76
77
78
79
80
class QueryInput(TextInput):
    """The input to the encoder for queries.

    Attributes:
        query: The query to encode. Can be a list of queries or a list of lists of queries.
        conversation: Optional. A list of conversations, each conversation is a list of messages.
        instruction: Optional. A list of instructions to encode.
    """

    query: list[str]
    conversation: NotRequired[list[Conversation]]
    instruction: NotRequired[list[str]]

ImageInput

Bases: TypedDict

The input to the encoder for images.

Attributes:

Name Type Description
image list[Image]

The image to encode. Can be a list of images or a list of lists of images.

Source code in mteb/types/_encoder_io.py
83
84
85
86
87
88
89
90
class ImageInput(TypedDict):
    """The input to the encoder for images.

    Attributes:
        image: The image to encode. Can be a list of images or a list of lists of images.
    """

    image: list[Image.Image]

AudioInput

Bases: TypedDict

The input to the encoder for audio.

Attributes:

Name Type Description
audio list[list[bytes]]

The audio to encode. Can be a list of audio files or a list of lists of audio files.

Source code in mteb/types/_encoder_io.py
 93
 94
 95
 96
 97
 98
 99
100
class AudioInput(TypedDict):
    """The input to the encoder for audio.

    Attributes:
        audio: The audio to encode. Can be a list of audio files or a list of lists of audio files.
    """

    audio: list[list[bytes]]

MultimodalInput

Bases: TextInput, CorpusInput, QueryInput, ImageInput, AudioInput

The input to the encoder for multimodal data.

Source code in mteb/types/_encoder_io.py
103
104
105
106
class MultimodalInput(TextInput, CorpusInput, QueryInput, ImageInput, AudioInput):  # type: ignore[misc]
    """The input to the encoder for multimodal data."""

    pass

Metadata types

ISOLanguageScript = str module-attribute

A string representing the language and script. Language is denoted as a 3-letter ISO 639-3 language code and the script is denoted by a 4-letter ISO 15924 script code (e.g. "eng-Latn").

ISOLanguage = str module-attribute

A string representing the language. Language is denoted as a 3-letter ISO 639-3 language code (e.g. "eng").

ISOScript = str module-attribute

A string representing the script. The script is denoted by a 4-letter ISO 15924 script code (e.g. "Latn").

Languages = list[ISOLanguageScript] | Mapping[HFSubset, list[ISOLanguageScript]] module-attribute

A list of languages or a mapping from HFSubset to a list of languages. E.g. ["eng-Latn", "deu-Latn"] or {"en-de": ["eng-Latn", "deu-Latn"], "fr-it": ["fra-Latn", "ita-Latn"]}.

Licenses = Literal['not specified', 'mit', 'cc-by-2.0', 'cc-by-3.0', 'cc-by-4.0', 'cc-by-sa-3.0', 'cc-by-sa-4.0', 'cc-by-nc-3.0', 'cc-by-nc-4.0', 'cc-by-nc-sa-3.0', 'cc-by-nc-sa-4.0', 'cc-by-nc-nd-4.0', 'cc-by-nd-4.0', 'openrail', 'openrail++', 'odc-by', 'afl-3.0', 'apache-2.0', 'cc-by-nd-2.1-jp', 'cc0-1.0', 'bsd-3-clause', 'gpl-3.0', 'lgpl', 'lgpl-3.0', 'cdla-sharing-1.0', 'mpl-2.0', 'msr-la-nc', 'multiple', 'gemma'] module-attribute

The different licenses that a dataset or model can have. This list can be extended as needed.

ModelName = str module-attribute

The name of a model, typically as found on HuggingFace e.g. sentence-transformers/all-MiniLM-L6-v2.

Revision = str module-attribute

The revision of a model, typically a git commit hash. For APIs this can be a version string e.g. 1.

Modalities = Literal['text', 'image'] module-attribute

The different modalities that a model can support.

Results types

HFSubset = str module-attribute

The name of a HuggingFace dataset subset, e.g. 'en-de', 'en', 'default' (default is used when there is no subset).

SplitName = str module-attribute

The name of a data split, e.g. 'test', 'validation', 'train'.

Score = Any module-attribute

A score value, could e.g. be accuracy. Normally it is a float or int, but it can take on any value. Should be json serializable.

ScoresDict = dict[str, Score] module-attribute

A dictionary of scores, typically also include metadata, e.g {'main_score': 0.5, 'accuracy': 0.5, 'f1': 0.6, 'hf_subset': 'en-de', 'languages': ['eng-Latn', 'deu-Latn']}

RetrievalEvaluationResult

Bases: NamedTuple

Holds the results of retrieval evaluation metrics.

Source code in mteb/types/_result.py
15
16
17
18
19
20
21
22
23
24
25
26
class RetrievalEvaluationResult(NamedTuple):
    """Holds the results of retrieval evaluation metrics."""

    all_scores: dict[str, dict[str, float]]
    ndcg: dict[str, float]
    map: dict[str, float]
    recall: dict[str, float]
    precision: dict[str, float]
    naucs: dict[str, float]
    mrr: dict[str, float]
    naucs_mrr: dict[str, float]
    cv_recall: dict[str, float]

Statistics types

SplitDescriptiveStatistics

Bases: TypedDict

Base class for descriptive statistics for the subset.

Source code in mteb/types/statistics.py
6
7
8
9
class SplitDescriptiveStatistics(TypedDict):
    """Base class for descriptive statistics for the subset."""

    pass

DescriptiveStatistics

Bases: TypedDict, SplitDescriptiveStatistics

Class for descriptive statistics for the full task.

Source code in mteb/types/statistics.py
12
13
14
15
class DescriptiveStatistics(TypedDict, SplitDescriptiveStatistics):
    """Class for descriptive statistics for the full task."""

    hf_subset_descriptive_stats: NotRequired[dict[HFSubset, SplitDescriptiveStatistics]]

TextStatistics

Bases: TypedDict

Class for descriptive statistics for texts.

Attributes:

Name Type Description
total_text_length int

Total length of all texts

min_text_length int

Minimum length of text

average_text_length float

Average length of text

max_text_length int

Maximum length of text

unique_texts int

Number of unique texts

Source code in mteb/types/statistics.py
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
class TextStatistics(TypedDict):
    """Class for descriptive statistics for texts.

    Attributes:
        total_text_length: Total length of all texts
        min_text_length: Minimum length of text
        average_text_length: Average length of text
        max_text_length: Maximum length of text
        unique_texts: Number of unique texts
    """

    total_text_length: int
    min_text_length: int
    average_text_length: float
    max_text_length: int
    unique_texts: int

ImageStatistics

Bases: TypedDict

Class for descriptive statistics for images.

Attributes:

Name Type Description
min_image_width float

Minimum width of images

average_image_width float

Average width of images

max_image_width float

Maximum width of images

min_image_height float

Minimum height of images

average_image_height float

Average height of images

max_image_height float

Maximum height of images

unique_images int

Number of unique images

Source code in mteb/types/statistics.py
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
class ImageStatistics(TypedDict):
    """Class for descriptive statistics for images.

    Attributes:
        min_image_width: Minimum width of images
        average_image_width: Average width of images
        max_image_width: Maximum width of images

        min_image_height: Minimum height of images
        average_image_height: Average height of images
        max_image_height: Maximum height of images

        unique_images: Number of unique images
    """

    min_image_width: float
    average_image_width: float
    max_image_width: float

    min_image_height: float
    average_image_height: float
    max_image_height: float

    unique_images: int

LabelStatistics

Bases: TypedDict

Class for descriptive statistics for texts.

Attributes:

Name Type Description
min_labels_per_text int

Minimum number of labels per text

average_label_per_text float

Average number of labels per text

max_labels_per_text int

Maximum number of labels per text

unique_labels int

Number of unique labels

labels dict[str, dict[str, int]]

dict of label frequencies

Source code in mteb/types/statistics.py
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
class LabelStatistics(TypedDict):
    """Class for descriptive statistics for texts.

    Attributes:
        min_labels_per_text: Minimum number of labels per text
        average_label_per_text: Average number of labels per text
        max_labels_per_text: Maximum number of labels per text

        unique_labels: Number of unique labels
        labels: dict of label frequencies
    """

    min_labels_per_text: int
    average_label_per_text: float
    max_labels_per_text: int

    unique_labels: int
    labels: dict[str, dict[str, int]]

ScoreStatistics

Bases: TypedDict

Class for descriptive statistics for texts.

Attributes:

Name Type Description
min_score int

Minimum score

avg_score float

Average score

max_score int

Maximum score

Source code in mteb/types/statistics.py
82
83
84
85
86
87
88
89
90
91
92
93
class ScoreStatistics(TypedDict):
    """Class for descriptive statistics for texts.

    Attributes:
        min_score: Minimum score
        avg_score: Average score
        max_score: Maximum score
    """

    min_score: int
    avg_score: float
    max_score: int

TopRankedStatistics

Bases: TypedDict

Statistics for top ranked documents in a retrieval task.

Attributes:

Name Type Description
num_top_ranked int

Total number of top ranked documents across all queries.

min_top_ranked_per_query int

Minimum number of top ranked documents for any query.

average_top_ranked_per_query float

Average number of top ranked documents per query.

max_top_ranked_per_query int

Maximum number of top ranked documents for any query.

Source code in mteb/types/statistics.py
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
class TopRankedStatistics(TypedDict):
    """Statistics for top ranked documents in a retrieval task.

    Attributes:
        num_top_ranked: Total number of top ranked documents across all queries.
        min_top_ranked_per_query: Minimum number of top ranked documents for any query.
        average_top_ranked_per_query: Average number of top ranked documents per query.
        max_top_ranked_per_query: Maximum number of top ranked documents for any query.
    """

    num_top_ranked: int
    min_top_ranked_per_query: int
    average_top_ranked_per_query: float
    max_top_ranked_per_query: int

RelevantDocsStatistics

Bases: TypedDict

Statistics for relevant documents in a retrieval task.

Attributes:

Name Type Description
num_relevant_docs int

Total number of relevant documents across all queries.

min_relevant_docs_per_query int

Minimum number of relevant documents for any query.

average_relevant_docs_per_query float

Average number of relevant documents per query.

max_relevant_docs_per_query float

Maximum number of relevant documents for any query.

unique_relevant_docs int

Number of unique relevant documents across all queries.

Source code in mteb/types/statistics.py
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
class RelevantDocsStatistics(TypedDict):
    """Statistics for relevant documents in a retrieval task.

    Attributes:
        num_relevant_docs: Total number of relevant documents across all queries.
        min_relevant_docs_per_query: Minimum number of relevant documents for any query.
        average_relevant_docs_per_query: Average number of relevant documents per query.
        max_relevant_docs_per_query: Maximum number of relevant documents for any query.
        unique_relevant_docs: Number of unique relevant documents across all queries.
    """

    num_relevant_docs: int
    min_relevant_docs_per_query: int
    average_relevant_docs_per_query: float
    max_relevant_docs_per_query: float
    unique_relevant_docs: int