Modules
sentence_transformers.base.modules defines different building blocks, a.k.a. Modules, that can be used to create models from scratch.
Common Modules
- class sentence_transformers.base.modules.Transformer(model_name_or_path: str, *, transformer_task: Literal['feature-extraction', 'sequence-classification', 'text-generation', 'any-to-any', 'fill-mask'] = 'feature-extraction', model_kwargs: dict[str, Any] | None = None, processor_kwargs: dict[str, Any] | None = None, config_kwargs: dict[str, Any] | None = None, processing_kwargs: ProcessingKwargs | None = None, backend: Literal['torch', 'onnx', 'openvino'] = 'torch', modality_config: dict[Literal['text', 'image', 'audio', 'video', 'message'] | tuple[Literal['text', 'image', 'audio', 'video'], ...], ModalityParams] | None = None, module_output_name: str | None = None, unpad_inputs: bool | None = None, max_seq_length: int | None = None, do_lower_case: bool = False, tokenizer_name_or_path: str | None = None)[source]
Hugging Face AutoModel wrapper that handles loading, preprocessing, and inference.
Loads the appropriate model class (e.g. BERT, RoBERTa, CLIP, Whisper) based on the model configuration and the specified
transformer_task. Supports text, image, audio, and video modalities depending on the underlying model. This module is typically the first module in a SentenceTransformer, SparseEncoder, or CrossEncoder pipeline.
- Parameters:
model_name_or_path (str) – Hugging Face model name or path to a local model directory.
transformer_task (str, optional) –
The task determining which AutoModel-like class to load. Supported values:
- "feature-extraction" (default): AutoModel, e.g. used by SentenceTransformer.
- "sequence-classification": AutoModelForSequenceClassification, e.g. used by CrossEncoder.
- "text-generation": AutoModelForCausalLM, e.g. used by generative CrossEncoder models. Sets the tokenizer padding_side to "left".
- "any-to-any": AutoModelForMultimodalLM, e.g. used by multimodal generative CrossEncoder models (requires transformers v5+). Sets the tokenizer padding_side to "left".
- "fill-mask": AutoModelForMaskedLM, e.g. used by SparseEncoder.
Defaults to "feature-extraction".
model_kwargs (dict[str, Any], optional) –
Keyword arguments forwarded to AutoModel.from_pretrained when loading the model. Particularly useful options include:
- torch_dtype: Override the default torch.dtype and load the model under a specific dtype. Can be torch.float16, torch.bfloat16, torch.float32, or "auto" to use the dtype from the model's config.json.
- attn_implementation: The attention implementation to use, e.g. "eager", "sdpa", or "flash_attention_2". If you pip install kernels, then "flash_attention_2" should work without having to install flash_attn. It is frequently the fastest option. Defaults to "sdpa" when available (torch>=2.1.1).
- device_map: Device map for model parallelism, e.g. "auto".
- provider: For backend="onnx", the ONNX execution provider (e.g. "CUDAExecutionProvider").
- file_name: For backend="onnx" or "openvino", the filename to load (e.g. for optimized or quantized models).
- export: For backend="onnx" or "openvino", whether to export the model to the backend format. Also set automatically if the exported file doesn't exist.
See the PreTrainedModel.from_pretrained documentation for more details. Defaults to None.
processor_kwargs (dict[str, Any], optional) – Keyword arguments forwarded to
AutoProcessor.from_pretrained when loading the processor/tokenizer. See the AutoTokenizer.from_pretrained documentation for more details. Defaults to None.
config_kwargs (dict[str, Any], optional) – Keyword arguments forwarded to AutoConfig.from_pretrained when loading the config. See the AutoConfig.from_pretrained documentation for more details. Defaults to None.
processing_kwargs (dict[str, dict[str, Any]], optional) – Keyword arguments applied when calling the processor during preprocessing. This is a nested dict whose keys are modality names ("text", "audio", "image", "video"), "common" for kwargs shared across all modalities, or "chat_template" for kwargs forwarded to apply_chat_template (e.g. {"add_generation_prompt": True}). Modality and common kwargs override the built-in defaults. Saved to and loaded from the model configuration file. Defaults to None.
backend (str, optional) – Backend used for model inference. Can be "torch" (default), "onnx", or "openvino". Defaults to "torch".
modality_config (dict, optional) – Custom modality configuration mapping modality names to method and output name dicts. When provided, module_output_name must also be set. The "message" modality entry may include a "format" key ("structured", "flat", or "auto") to control how chat-template inputs are formatted. Defaults to None.
module_output_name (str, optional) – The name of the output feature this module creates (e.g. "token_embeddings", "scores"). Required when modality_config is provided. Defaults to None.
unpad_inputs (bool, optional) – Controls whether text-only inputs are concatenated without padding for faster inference using flash attention's variable-length functions. Non-text inputs (images, audio, video) are always padded normally. If None (default), unpadding is enabled automatically when all prerequisites are met (flash attention with variable-length support, "torch" backend, "feature-extraction" task). Set to False to force padding, which is needed for architectures that don't support unpadded inputs (e.g. qwen2_vl). Set to True to request unpadding explicitly; a warning is logged if the prerequisites are not met. Defaults to None.
max_seq_length (int, optional) – Truncate any inputs longer than this value. Prefer setting model_max_length via processor_kwargs instead. Defaults to None.
do_lower_case (bool, optional) – If True, lowercases the input (independent of whether the model is cased or not). Rarely needed. Defaults to False.
tokenizer_name_or_path (str, optional) – Name or path of the tokenizer. When None, model_name_or_path is used. Deprecated. Defaults to None.
- get_embedding_dimension() int[source]
Get the output embedding dimension from the transformer model.
- Returns:
The hidden dimension size of the model’s embeddings.
- Return type:
int
- Raises:
ValueError – If the embedding dimension cannot be determined from the model config.
- property max_seq_length: int | None
The maximum input sequence length. Reads from the tokenizer if available, otherwise falls back to
max_position_embeddings from the model config.
- property modalities: list[Literal['text', 'image', 'audio', 'video', 'message'] | tuple[Literal['text', 'image', 'audio', 'video'], ...]]
The list of supported input modalities (e.g.
"text", "image", ("image", "text")).
- preprocess(inputs: list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | MessageDict | list[MessageDict] | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict] | tuple[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict], str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]] | list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]]], prompt: str | None = None, **kwargs) dict[str, Any][source]
Preprocess inputs into model-ready features.
- Parameters:
inputs – List of inputs. Can contain strings, dicts with modality keys, PIL images, or numpy/torch arrays for audio/video.
prompt – Optional prompt to prepend to text inputs or inject as a system message.
**kwargs – Additional keyword arguments forwarded to prompt length computation (e.g.
task). Only used when prompt is provided for text inputs.
- Returns:
Dictionary containing preprocessed tensors with a
modality key indicating the input type and optionally a prompt_length key for prompt-aware pooling.
- class sentence_transformers.base.modules.Dense(in_features: int, out_features: int, bias: bool = True, activation_function: Callable[[Tensor], Tensor] | None = Tanh(), init_weight: Tensor | None = None, init_bias: Tensor | None = None, module_input_name: str = 'sentence_embedding', module_output_name: str | None = None)[source]
Applies a linear transformation with an optional activation function.
Passes the embedding through a feed-forward layer (
nn.Linear + activation), useful for dimensionality reduction or projecting embeddings into a different space.
- Parameters:
in_features – Size of the input dimension.
out_features – Size of the output dimension.
bias – Whether to include a bias vector in the linear layer.
activation_function – Activation function applied after the linear layer. If
None, uses nn.Identity(). Defaults to nn.Tanh().
init_weight – Initial value for the weight matrix of the linear layer.
init_bias – Initial value for the bias vector of the linear layer.
module_input_name – The key in the features dictionary to read the input from. Defaults to
"sentence_embedding".
module_output_name – The key in the features dictionary to store the output in. If None, uses the same key as module_input_name.
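Conceptually, this module applies nn.Linear plus the chosen activation to one key of the features dictionary. A minimal standalone sketch of that behavior (illustrative only; the 384 and 128 dimensions are arbitrary, and this is not the actual class):

```python
import torch
from torch import nn

# Sketch of what Dense does in a pipeline: read an embedding from the
# features dict, apply Linear + activation, and write the result back.
linear = nn.Linear(384, 128)  # in_features=384, out_features=128 (arbitrary)
activation = nn.Tanh()        # the default activation_function

features = {"sentence_embedding": torch.randn(2, 384)}
features["sentence_embedding"] = activation(linear(features["sentence_embedding"]))
print(features["sentence_embedding"].shape)  # torch.Size([2, 128])
```

Because the output passes through Tanh, every dimension of the projected embedding lies in (-1, 1).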
- class sentence_transformers.base.modules.Router(sub_modules: dict[str, list[Module]], default_route: str | None = None, allow_empty_key: bool = True, route_mappings: dict[tuple[str | None, str | tuple[str, ...] | None], str] | None = None)[source]
This module allows creating flexible SentenceTransformer models that dynamically route inputs to different processing modules based on:
Task type (e.g., “query” or “document”) for asymmetric retrieval models
Modality (e.g., “text”, “image”, or (“text”, “image”)) for crossmodal or multimodal models
Combination of both for complex routing scenarios
Tips:
- The task argument in model.encode() specifies which route to use
- model.encode_query() and model.encode_document() are convenient shorthands for task="query" and task="document"
- Modality is automatically inferred from input data (text strings, PIL Images, etc.)
- You can override automatic inference by passing modality in model.encode() (and its variants) explicitly
Route Priority:
1. Exact match: (task, modality), e.g. ("query", "text")
2. Task with any modality: (task, None), e.g. ("query", None)
3. Any task with modality: (None, modality), e.g. (None, "image")
4. Catch-all: (None, None)
5. Direct lookup by task name in sub_modules
6. Direct lookup by modality name in sub_modules
7. Fall back to default_route if set
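The priority order above can be sketched in plain Python (a simplified illustration of the documented resolution order, not the library's actual implementation):

```python
def resolve_route(route_mappings, sub_modules, default_route, task, modality):
    """Resolve a route key following the documented priority order (sketch)."""
    # Steps 1-4: route_mappings lookups, from most to least specific
    for key in ((task, modality), (task, None), (None, modality), (None, None)):
        if key in route_mappings:
            return route_mappings[key]
    # Steps 5-6: direct lookup by task, then by modality, in sub_modules
    if task in sub_modules:
        return task
    if modality in sub_modules:
        return modality
    # Step 7: fall back to the default route (may be None)
    return default_route

sub_modules = {"query": [], "document": []}
mappings = {(None, "image"): "document"}
print(resolve_route(mappings, sub_modules, "document", "query", "text"))  # query
print(resolve_route(mappings, sub_modules, "document", None, "image"))    # document
```

The first call falls through all four mapping keys and succeeds at the direct task lookup; the second matches the wildcard-task mapping for the "image" modality.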
In the examples below, the Router module is used to create asymmetric models with different encoders for queries and documents. The "query" route is efficient (e.g., using SparseStaticEmbedding), while the "document" route uses a more complex model (e.g., a Transformer module). This allows for efficient query encoding while still using a powerful document encoder, but the combinations are not limited to this.
Example
from sentence_transformers import SentenceTransformer
from sentence_transformers.sentence_transformer.modules import Router, Normalize

# Use a regular SentenceTransformer for the document embeddings, and a static embedding model for the query embeddings
document_embedder = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
query_embedder = SentenceTransformer("sentence-transformers/static-retrieval-mrl-en-v1")
router = Router.for_query_document(
    query_modules=list(query_embedder.children()),
    document_modules=list(document_embedder.children()),
)
normalize = Normalize()

# Create an asymmetric model with different encoders for queries and documents
model = SentenceTransformer(modules=[router, normalize])
# ... requires more training to align the vector spaces

# Use the query & document routes
query_embedding = model.encode_query("What is the capital of France?")
document_embedding = model.encode_document("Paris is the capital of France.")
from sentence_transformers.sparse_encoder.modules import Router, SparseStaticEmbedding, SpladePooling, Transformer
from sentence_transformers.sparse_encoder import SparseEncoder

# Load an asymmetric model with different encoders for queries and documents
doc_encoder = Transformer("opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill", transformer_task="fill-mask")
router = Router.for_query_document(
    query_modules=[
        SparseStaticEmbedding.from_json(
            "opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill",
            tokenizer=doc_encoder.tokenizer,
            frozen=True,
        ),
    ],
    document_modules=[
        doc_encoder,
        SpladePooling(pooling_strategy="max", activation_function="log1p_relu"),
    ],
)

model = SparseEncoder(modules=[router], similarity_fn_name="dot")

query = "What's the weather in ny now?"
document = "Currently New York is rainy."

query_embed = model.encode_query(query)
document_embed = model.encode_document(document)

sim = model.similarity(query_embed, document_embed)
print(f"Similarity: {sim}")

# Visualize top tokens for each text
top_k = 10
print(f"Top tokens {top_k} for each text:")
decoded_query = model.decode(query_embed, top_k=top_k)
decoded_document = model.decode(document_embed)

for i in range(min(top_k, len(decoded_query))):
    query_token, query_score = decoded_query[i]
    doc_score = next((score for token, score in decoded_document if token == query_token), 0)
    if doc_score != 0:
        print(f"Token: {query_token}, Query score: {query_score:.4f}, Document score: {doc_score:.4f}")
'''
Similarity: tensor([[11.1105]], device='cuda:0')
Top tokens 10 for each text:
Token: ny, Query score: 5.7729, Document score: 0.8049
Token: weather, Query score: 4.5684, Document score: 0.9710
Token: now, Query score: 3.5895, Document score: 0.4720
Token: ?, Query score: 3.3313, Document score: 0.0286
Token: what, Query score: 2.7699, Document score: 0.0787
Token: in, Query score: 0.4989, Document score: 0.0417
'''
Multimodal Example:
from PIL import Image
from sentence_transformers import SentenceTransformer
from sentence_transformers.sentence_transformer.modules import Dense, Pooling, Router, Transformer

# Create separate encoders for different modalities
text_encoder = Transformer("sentence-transformers/all-MiniLM-L6-v2")
# Project to 768 dims to match image encoder
text_dense = Dense(text_encoder.get_embedding_dimension(), 768, module_input_name="token_embeddings")
image_encoder = Transformer(
    "ModernVBERT/modernvbert",
    model_kwargs={"trust_remote_code": True},
    processor_kwargs={"trust_remote_code": True},
    config_kwargs={"trust_remote_code": True},
)
pooling = Pooling(text_encoder.get_embedding_dimension())

# Route based on modality
router = Router(
    sub_modules={
        "text": [text_encoder, text_dense],
        "image": [image_encoder],
    },
    route_mappings={
        (None, "text"): "text",  # Any task with text goes to text encoder
        (None, ("text", "image")): "image",  # Any task with text-image together goes to image encoder
    },
)

model = SentenceTransformer(modules=[router, pooling])

# Modality is automatically inferred
text_embedding = model.encode("A photo of a cat")
multimodal_embedding = model.encode({"text": "A photo of a <image>", "image": Image.open("cat.jpg")})

# Compute the similarity; it'll be poor as the model hasn't yet been trained
similarity = model.similarity(text_embedding, multimodal_embedding)
Hybrid Asymmetric + Multimodal Example:
from PIL import Image
from sentence_transformers import SentenceTransformer
from sentence_transformers.sentence_transformer.modules import Router

# Different encoders for query text, document text, and images
# (query_text_modules, document_text_modules, and image_modules are
# placeholders for modules defined elsewhere, e.g. as in the previous examples)
router = Router(
    sub_modules={
        "query_text": [query_text_modules],
        "doc_text": [document_text_modules],
        "image": [image_modules],
    },
    route_mappings={
        ("query", "text"): "query_text",  # Query text uses efficient encoder
        ("document", "text"): "doc_text",  # Document text uses powerful encoder
        (None, ("text", "image")): "image",  # Any text-image together goes to image encoder
    },
)

model = SentenceTransformer(modules=[router])

# Explicit task + automatic modality inference
query_embedding = model.encode_query("Find images of cats")
doc_embedding = model.encode_document("Article about cats")
multimodal_embedding = model.encode({"text": "A photo of a cat", "image": Image.open("cat.jpg")})
Note
When training models with the
Router module, you must use the router_mapping argument in the SentenceTransformerTrainingArguments or SparseEncoderTrainingArguments to map the training dataset columns to the correct route ("query" or "document"). For example, if your training dataset(s) have ["question", "positive", "negative"] columns, then you can use the following mapping:

args = SparseEncoderTrainingArguments(
    ...,
    router_mapping={
        "question": "query",
        "positive": "document",
        "negative": "document",
    }
)
Additionally, it is common to use a different learning rate for the different routes. For this, you should use the
learning_rate_mapping argument in the SentenceTransformerTrainingArguments or SparseEncoderTrainingArguments to map parameter patterns to their learning rates. For example, if you want to use a learning rate of 1e-3 for a SparseStaticEmbedding module and 2e-5 for the rest of the model, you can do this:

args = SparseEncoderTrainingArguments(
    ...,
    learning_rate=2e-5,
    learning_rate_mapping={
        r"SparseStaticEmbedding\.*": 1e-3,
    }
)
- Parameters:
sub_modules – Mapping of route keys to lists of modules. Each key corresponds to a specific route name (e.g., “text_query”, “text_document”, “image”, “multimodal”). Each route contains a list of modules that will be applied sequentially when that route is selected.
default_route – The default route to use if no task type or modality is specified. If None, an exception will be thrown if no task type is specified. If allow_empty_key is True, the first key in sub_modules will be used as the default route. Defaults to None.
allow_empty_key – If True, allows the default route to be set to the first key in sub_modules if default_route is None. Defaults to True.
route_mappings –
Optional dictionary mapping (task, modality) tuples to route keys in sub_modules. This enables sophisticated routing logic based on combinations of task and modality:
- Use None as a wildcard for either task or modality to create catch-all rules
- Modality can be a string (e.g., "text", "image") or a tuple (e.g., ("text", "image"))
- Routes are resolved with a priority order (see Route Priority above)
- All mapped routes must exist in sub_modules (validated at initialization)
Example mappings:
{
    # Exact matches (highest priority)
    ("query", "text"): "efficient_text_encoder",
    ("document", "text"): "powerful_text_encoder",
    # Task with any modality
    ("query", None): "query_encoder",  # All query tasks
    # Any task with specific modality
    (None, "image"): "image_encoder",  # All image inputs
    (None, ("text", "image")): "multimodal_encoder",  # Multimodal inputs
    # Catch-all (lowest priority)
    (None, None): "default_encoder",
}
If not provided, the router will attempt direct lookup using the task or modality as the route key in
sub_modules, then fall back to default_route.
- classmethod for_query_document(query_modules: list[Module], document_modules: list[Module], default_route: str | None = 'document', allow_empty_key: bool = True) Self[source]
Creates a Router model specifically for query and document modules, allowing convenient usage via model.encode_query and model.encode_document.
- Parameters:
query_modules – List of modules to be applied for the “query” task type.
document_modules – List of modules to be applied for the “document” task type.
default_route – The default route to use if no task type is specified. If None, an exception will be thrown if no task type is specified. If allow_empty_key is True, the first key in sub_modules will be used as the default route. Defaults to "document".
allow_empty_key – If True, allows the default route to be set to the first key in sub_modules if default_route is None. Defaults to True.
- Returns:
An instance of the Router model with the specified query and document modules.
- Return type:
Self
Base Modules
- class sentence_transformers.base.modules.Module(*args, **kwargs)[source]
Base class for all modules in the Sentence Transformers library.
This class provides a common interface for all modules, including methods for loading and saving the module’s configuration and weights. It also provides a method for performing the forward pass of the module.
Two abstract methods are defined in this class, which must be implemented by subclasses:
- sentence_transformers.base.modules.Module.forward(): The forward pass of the module.
- sentence_transformers.base.modules.Module.save(): Save the module to disk.
Optionally, you may also have to override:
- sentence_transformers.base.modules.Module.load(): Load the module from disk.
To assist with loading and saving the module, several utility methods are provided:
- sentence_transformers.base.modules.Module.load_config(): Load the module's configuration from a JSON file.
- sentence_transformers.base.modules.Module.load_file_path(): Load a file from the module's directory, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.load_dir_path(): Load a directory from the module's directory, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.load_torch_weights(): Load the PyTorch weights of the module, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.save_config(): Save the module's configuration to a JSON file.
- sentence_transformers.base.modules.Module.save_torch_weights(): Save the PyTorch weights of the module.
- sentence_transformers.base.modules.Module.get_config_dict(): Get the module's configuration as a dictionary.
And several class variables are defined to assist with loading and saving the module:
- sentence_transformers.base.modules.Module.config_file_name: The name of the configuration file used to save the module's configuration.
- sentence_transformers.base.modules.Module.config_keys: A list of keys used to save the module's configuration.
- sentence_transformers.base.modules.Module.save_in_root: Whether to save the module's configuration in the root directory of the model or in a subdirectory named after the module.
- config_file_name: str = 'config.json'
The name of the configuration file used to save the module’s configuration. This file is used to initialize the module when loading it from a pre-trained model.
- config_keys: list[str] = []
A list of keys used to save the module’s configuration. These keys are used to save the module’s configuration when saving the model to disk.
- abstract forward(features: dict[str, Tensor | Any], **kwargs) dict[str, Tensor | Any][source]
Forward pass of the module. This method should be overridden by subclasses to implement the specific behavior of the module.
The forward method takes a dictionary of features as input and returns a dictionary of features as output. The keys in the
features dictionary depend on the position of the module in the model pipeline, as the features dictionary is passed from one module to the next. Common keys in the features dictionary are:
- input_ids: The input IDs of the tokens in the input text.
- attention_mask: The attention mask for the input tokens.
- token_type_ids: The token type IDs for the input tokens.
- token_embeddings: The token embeddings for the input tokens.
- sentence_embedding: The sentence embedding for the input text, i.e. pooled token embeddings.
Optionally, the
forward method can accept additional keyword arguments (**kwargs) that can be used to pass additional information from model.encode to this module.
- Parameters:
features (dict[str, torch.Tensor | Any]) – A dictionary of features to be processed by the module.
**kwargs – Additional keyword arguments that can be used to pass additional information from
model.encode.
- Returns:
A dictionary of features after processing by the module.
- Return type:
dict[str, torch.Tensor | Any]
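As an illustration of this contract, a hypothetical mean-pooling forward that consumes token_embeddings and attention_mask and adds sentence_embedding might look like this (a sketch, not a module shipped by the library):

```python
import torch

def mean_pooling_forward(features: dict) -> dict:
    # Read earlier modules' outputs from the shared features dict
    token_embeddings = features["token_embeddings"]          # (batch, seq, dim)
    mask = features["attention_mask"].unsqueeze(-1).float()  # (batch, seq, 1)
    # Average token embeddings over non-padding positions
    features["sentence_embedding"] = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
    # Return the (augmented) features dict for the next module in the pipeline
    return features

features = {
    "token_embeddings": torch.randn(2, 4, 8),
    "attention_mask": torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]]),
}
print(mean_pooling_forward(features)["sentence_embedding"].shape)  # torch.Size([2, 8])
```

Note that the function mutates and returns the same dictionary: downstream modules see both the original keys and the newly added sentence_embedding.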
- get_config_dict() dict[str, Any][source]
Returns a dictionary of the configuration parameters of the module.
These parameters are used to save the module’s configuration when saving the model to disk, and again used to initialize the module when loading it from a pre-trained model. The keys used in the dictionary are defined in the
config_keys class variable.
- Returns:
A dictionary of the configuration parameters of the module.
- Return type:
dict[str, Any]
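The interplay of config_keys and get_config_dict can be sketched with a small stand-in class (hypothetical; it only mirrors the documented mechanism, not the real base class):

```python
class ScaleModule:
    # Stand-in for a Module subclass; only the config machinery is sketched
    config_file_name = "config.json"
    config_keys = ["scale", "bias"]

    def __init__(self, scale: float = 1.0, bias: float = 0.0):
        self.scale = scale
        self.bias = bias

    def get_config_dict(self) -> dict:
        # Collect the attributes named in config_keys, as the base class does
        return {key: getattr(self, key) for key in self.config_keys}

module = ScaleModule(scale=2.0)
print(module.get_config_dict())  # {'scale': 2.0, 'bias': 0.0}
```

Because the same dict is later passed back to __init__ as keyword arguments when loading, each entry in config_keys should correspond to a constructor parameter.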
- classmethod load(model_name_or_path: str, subfolder: str = '', token: bool | str | None = None, cache_folder: str | None = None, revision: str | None = None, local_files_only: bool = False, **kwargs) Self[source]
Load this module from a model checkpoint. The checkpoint can be either a local directory or a model id on Hugging Face.
- Parameters:
model_name_or_path (str) – The path to the model directory or the name of the model on Hugging Face.
subfolder (str, optional) – The subfolder within the model directory to load from, e.g.
"1_Pooling". Defaults to "".
token (bool | str | None, optional) – The token to use for authentication when loading from Hugging Face. If None, tries to use a token saved using huggingface-cli login or the HF_TOKEN environment variable. Defaults to None.
cache_folder (str | None, optional) – The folder to use for caching the model files. If None, uses the default cache folder for Hugging Face, ~/.cache/huggingface. Defaults to None.
revision (str | None, optional) – The revision of the model to load. If None, uses the latest revision. Defaults to None.
local_files_only (bool, optional) – Whether to only load local files. Defaults to False.
**kwargs – Additional module-specific arguments used in an overridden load method, such as trust_remote_code, model_kwargs, processor_kwargs, config_kwargs, backend, etc.
- Returns:
The loaded module.
- Return type:
Self
- classmethod load_config(model_name_or_path: str, subfolder: str = '', config_filename: str | None = None, token: bool | str | None = None, cache_folder: str | None = None, revision: str | None = None, local_files_only: bool = False) dict[str, Any][source]
Load the configuration of the module from a model checkpoint. The checkpoint can be either a local directory or a model id on Hugging Face. The configuration is loaded from a JSON file, which contains the parameters used to initialize the module.
- Parameters:
model_name_or_path (str) – The path to the model directory or the name of the model on Hugging Face.
subfolder (str, optional) – The subfolder within the model directory to load from, e.g.
"1_Pooling". Defaults to "".
config_filename (str | None, optional) – The name of the configuration file to load. If None, uses the default configuration file name defined in the config_file_name class variable. Defaults to None.
token (bool | str | None, optional) – The token to use for authentication when loading from Hugging Face. If None, tries to use a token saved using huggingface-cli login or the HF_TOKEN environment variable. Defaults to None.
cache_folder (str | None, optional) – The folder to use for caching the model files. If None, uses the default cache folder for Hugging Face, ~/.cache/huggingface. Defaults to None.
revision (str | None, optional) – The revision of the model to load. If None, uses the latest revision. Defaults to None.
local_files_only (bool, optional) – Whether to only load local files. Defaults to False.
- Returns:
A dictionary of the configuration parameters of the module.
- Return type:
dict[str, Any]
- static load_dir_path(model_name_or_path: str, subfolder: str = '', token: bool | str | None = None, cache_folder: str | None = None, revision: str | None = None, local_files_only: bool = False) str | None[source]
A utility function to load a directory from a model checkpoint. The checkpoint can be either a local directory or a model id on Hugging Face.
- Parameters:
model_name_or_path (str) – The path to the model directory or the name of the model on Hugging Face.
subfolder (str, optional) – The subfolder within the model directory to load from, e.g.
"1_Pooling". Defaults to "".
token (bool | str | None, optional) – The token to use for authentication when loading from Hugging Face. If None, tries to use a token saved using huggingface-cli login or the HF_TOKEN environment variable. Defaults to None.
cache_folder (str | None, optional) – The folder to use for caching the model files. If None, uses the default cache folder for Hugging Face, ~/.cache/huggingface. Defaults to None.
revision (str | None, optional) – The revision of the model to load. If None, uses the latest revision. Defaults to None.
local_files_only (bool, optional) – Whether to only load local files. Defaults to False.
- Returns:
The path to the loaded directory.
- Return type:
str | None
- static load_file_path(model_name_or_path: str, filename: str, subfolder: str = '', token: bool | str | None = None, cache_folder: str | None = None, revision: str | None = None, local_files_only: bool = False) str | None[source]
A utility function to load a file from a model checkpoint. The checkpoint can be either a local directory or a model id on Hugging Face. The file is loaded from the specified subfolder within the model directory.
- Parameters:
model_name_or_path (str) – The path to the model directory or the name of the model on Hugging Face.
filename (str) – The name of the file to load.
subfolder (str, optional) – The subfolder within the model directory to load from, e.g.
"1_Pooling". Defaults to "".
token (bool | str | None, optional) – The token to use for authentication when loading from Hugging Face. If None, tries to use a token saved using huggingface-cli login or the HF_TOKEN environment variable. Defaults to None.
cache_folder (str | None, optional) – The folder to use for caching the model files. If None, uses the default cache folder for Hugging Face, ~/.cache/huggingface. Defaults to None.
revision (str | None, optional) – The revision of the model to load. If None, uses the latest revision. Defaults to None.
local_files_only (bool, optional) – Whether to only load local files. Defaults to False.
- Returns:
The path to the loaded file, or None if the file was not found.
- Return type:
str | None
- classmethod load_torch_weights(model_name_or_path: str, subfolder: str = '', token: bool | str | None = None, cache_folder: str | None = None, revision: str | None = None, local_files_only: bool = False, model: Self | None = None)[source]
A utility function to load the PyTorch weights of a model from a checkpoint. The checkpoint can be either a local directory or a model id on Hugging Face. The weights are loaded from either a
model.safetensors file or a pytorch_model.bin file, depending on which one is available. This method either loads the weights into the model or returns the weights as a state dictionary.
- Parameters:
model_name_or_path (str) – The path to the model directory or the name of the model on Hugging Face.
subfolder (str, optional) – The subfolder within the model directory to load from, e.g.
"2_Dense". Defaults to "".
token (bool | str | None, optional) – The token to use for authentication when loading from Hugging Face. If None, tries to use a token saved using huggingface-cli login or the HF_TOKEN environment variable. Defaults to None.
cache_folder (str | None, optional) – The folder to use for caching the model files. If None, uses the default cache folder for Hugging Face, ~/.cache/huggingface. Defaults to None.
revision (str | None, optional) – The revision of the model to load. If None, uses the latest revision. Defaults to None.
local_files_only (bool, optional) – Whether to only load local files. Defaults to False.
model (Self | None, optional) – The model to load the weights into. If None, returns the weights as a state dictionary. Defaults to None.
- Raises:
ValueError – If neither a
model.safetensors file nor a pytorch_model.bin file is found in the model checkpoint in the subfolder.
- Returns:
The model with the loaded weights or the weights as a state dictionary, depending on the value of the model argument.
- Return type:
Self | dict[str, torch.Tensor]
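The file-selection precedence described above (prefer model.safetensors, fall back to pytorch_model.bin, raise if neither exists) can be sketched as a small self-contained helper. Note that resolve_weights_file is a hypothetical name for illustration, not part of the library:

```python
from pathlib import PurePosixPath


def resolve_weights_file(available_files: list[str], subfolder: str = "") -> str:
    """Pick which weights file to load, mirroring the documented precedence:
    model.safetensors is preferred over pytorch_model.bin."""
    for candidate in ("model.safetensors", "pytorch_model.bin"):
        path = str(PurePosixPath(subfolder) / candidate) if subfolder else candidate
        if path in available_files:
            return path
    # Mirrors the ValueError documented for load_torch_weights
    raise ValueError(
        f"No model.safetensors or pytorch_model.bin found in subfolder {subfolder!r}"
    )
```

For example, a checkpoint containing both files resolves to model.safetensors, and a subfolder such as "2_Dense" is searched relative to the checkpoint root.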
- abstract save(output_path: str, *args, safe_serialization: bool = True, **kwargs) None[source]
Save the module to disk. This method should be overridden by subclasses to implement the specific behavior of the module.
- Parameters:
output_path (str) – The path to the directory where the module should be saved.
*args – Additional arguments that can be used to pass additional information to the save method.
safe_serialization (bool, optional) – Whether to use the safetensors format for saving the model weights. Defaults to True.
**kwargs – Additional keyword arguments that can be used to pass additional information to the save method.
- save_config(output_path: str, filename: str | None = None) None[source]
Save the configuration of the module to a JSON file.
- Parameters:
output_path (str) – The path to the directory where the configuration file should be saved.
filename (str | None, optional) – The name of the configuration file. If None, uses the default configuration file name defined in the config_file_name class variable. Defaults to None.
- Returns:
None
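The save_config contract above (write the keys named in config_keys to a JSON file, defaulting to config_file_name) can be sketched with a toy stand-in class. TinyModule and its attributes are hypothetical, chosen only to illustrate the documented behavior:

```python
import json
import os


class TinyModule:
    # Class variables mirroring the documented Module conventions
    config_file_name = "config.json"
    config_keys = ["in_features", "out_features"]

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

    def get_config_dict(self):
        # Collect only the attributes named in config_keys
        return {key: getattr(self, key) for key in self.config_keys}

    def save_config(self, output_path, filename=None):
        # Fall back to the class-level config_file_name when no name is given
        filename = filename or self.config_file_name
        with open(os.path.join(output_path, filename), "w") as f:
            json.dump(self.get_config_dict(), f, indent=2)
```

Calling save_config(output_dir) then produces a config.json containing only the whitelisted keys, which a matching load_config can read back.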
- save_in_root: bool = False
Whether to save the module’s configuration in the root directory of the model or in a subdirectory named after the module.
- save_torch_weights(output_path: str, safe_serialization: bool = True) None[source]
Save the PyTorch weights of the module to disk.
- Parameters:
output_path (str) – The path to the directory where the weights should be saved.
safe_serialization (bool, optional) – Whether to use the safetensors format for saving the model weights. Defaults to True.
- Returns:
None
- class sentence_transformers.base.modules.InputModule(*args, **kwargs)[source]
Subclass of sentence_transformers.base.modules.Module; the base class for all input modules in the Sentence Transformers library, i.e. modules that process inputs and optionally also perform processing in the forward pass.
This class provides a common interface for all input modules, including methods for loading and saving the module’s configuration and weights, as well as input processing. It also provides a method for performing the forward pass of the module.
Two abstract methods are inherited from Module and must be implemented by subclasses:
- sentence_transformers.base.modules.Module.forward(): The forward pass of the module.
- sentence_transformers.base.modules.Module.save(): Save the module to disk.
Additionally, subclasses should override:
- sentence_transformers.base.modules.InputModule.preprocess(): Preprocess the inputs and return a dictionary of preprocessed features.
Optionally, you may also have to override:
- sentence_transformers.base.modules.InputModule.modalities: The list of supported input modalities. Defaults to ["text"]. Override this to advertise support for non-text modalities (e.g. ["text", "image"]).
- sentence_transformers.base.modules.Module.load(): Load the module from disk.
To assist with loading and saving the module, several utility methods are provided:
- sentence_transformers.base.modules.Module.load_config(): Load the module’s configuration from a JSON file.
- sentence_transformers.base.modules.Module.load_file_path(): Load a file from the module’s directory, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.load_dir_path(): Load a directory from the module’s directory, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.load_torch_weights(): Load the PyTorch weights of the module, regardless of whether the module is saved locally or on Hugging Face.
- sentence_transformers.base.modules.Module.save_config(): Save the module’s configuration to a JSON file.
- sentence_transformers.base.modules.Module.save_torch_weights(): Save the PyTorch weights of the module.
- sentence_transformers.base.modules.InputModule.save_tokenizer(): Save the tokenizer used by the module.
- sentence_transformers.base.modules.Module.get_config_dict(): Get the module’s configuration as a dictionary.
And several class variables are defined to assist with loading and saving the module:
- sentence_transformers.base.modules.Module.config_file_name: The name of the configuration file used to save the module’s configuration.
- sentence_transformers.base.modules.Module.config_keys: A list of keys used to save the module’s configuration.
- sentence_transformers.base.modules.InputModule.save_in_root: Whether to save the module’s configuration in the root directory of the model or in a subdirectory named after the module.
- sentence_transformers.base.modules.InputModule.tokenizer: The tokenizer used by the module.
- property modalities: list[Literal['text', 'image', 'audio', 'video', 'message'] | tuple[Literal['text', 'image', 'audio', 'video'], ...]]
The list of supported input modalities. Defaults to
["text"].
- preprocess(inputs: list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | MessageDict | list[MessageDict] | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict] | tuple[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict], str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]] | list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]]], prompt: str | None = None, **kwargs) dict[str, Tensor | Any][source]
Preprocesses the inputs (texts, images, audio, video, or messages, depending on the module’s supported modalities) and returns a dictionary of preprocessed features.
- Parameters:
inputs (list[SingleInput | PairInput]) – List of inputs to preprocess.
prompt (str | None) – Optional prompt to prepend to text inputs.
**kwargs – Additional keyword arguments for preprocessing, e.g.
task.
- Returns:
- Dictionary containing preprocessed features, e.g.
{"input_ids": ..., "attention_mask": ...}, depending on what keys the module’s forward method expects.
- Return type:
dict[str, torch.Tensor | Any]
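To make the preprocess contract concrete, here is a toy text-only sketch: it prepends the optional prompt, maps whitespace tokens to ids, and pads to the longest sequence. WhitespaceInputModule is hypothetical and uses plain Python lists instead of tensors; a real module would return torch.Tensor values:

```python
class WhitespaceInputModule:
    # Toy illustration of the preprocess contract, not a real library module
    def __init__(self, vocab: dict):
        self.vocab = vocab  # token string -> integer id; 0 is unknown/padding

    def preprocess(self, inputs: list, prompt=None, **kwargs) -> dict:
        # Prepend the optional prompt to every text input
        texts = [f"{prompt}{text}" if prompt else text for text in inputs]
        input_ids = [
            [self.vocab.get(tok, 0) for tok in text.split()] for text in texts
        ]
        # Pad to the longest sequence and build a matching attention mask
        max_len = max(len(ids) for ids in input_ids)
        attention_mask = [
            [1] * len(ids) + [0] * (max_len - len(ids)) for ids in input_ids
        ]
        input_ids = [ids + [0] * (max_len - len(ids)) for ids in input_ids]
        return {"input_ids": input_ids, "attention_mask": attention_mask}
```

The returned keys ("input_ids", "attention_mask") are exactly what a typical forward method would consume, matching the example in the Returns section above.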
- save_in_root: bool = True
Whether to save the module’s configuration in the root directory of the model or in a subdirectory named after the module.
- save_tokenizer(output_path: str, **kwargs) None[source]
Saves the tokenizer to the specified output path.
- Parameters:
output_path (str) – Path to save the tokenizer.
**kwargs – Additional keyword arguments for saving the tokenizer.
- Returns:
None
- tokenize(texts: list[str], **kwargs) dict[str, Tensor | Any][source]
Deprecated: tokenize is deprecated. Use preprocess instead.
Tokenizes the input texts and returns a dictionary of tokenized features.
- Parameters:
texts (list[str]) – List of input texts to tokenize.
**kwargs – Additional keyword arguments for tokenization, e.g.
task.
- Returns:
- Dictionary containing tokenized features, e.g.
{"input_ids": ..., "attention_mask": ...}
- Return type:
dict[str, torch.Tensor | Any]
- tokenizer: PreTrainedTokenizerBase | Tokenizer
The tokenizer used for tokenizing the input texts. It can be either a
transformers.PreTrainedTokenizerBase subclass or a Tokenizer from the tokenizers library.