Losses

sentence_transformers.losses defines different loss functions that can be used to fine-tune embedding models on training data. The choice of loss function plays a critical role when fine-tuning the model: it determines how well our embedding model will work for the specific downstream task.

Sadly, there is no “one size fits all” loss function. Which loss function is suitable depends on the available training data and on the target task. Consider checking out the Loss Overview to help narrow down your choice of loss function(s).
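
For orientation, here is a minimal sketch of how a loss plugs into training, assuming the "all-MiniLM-L6-v2" checkpoint as the base model and a tiny, hypothetical dataset of scored sentence pairs; the loss class is the component you swap out depending on your data format and task.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

# Base model to fine-tune (assumed checkpoint; any Sentence Transformer model works).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical training data: sentence pairs with gold similarity scores.
train_dataset = Dataset.from_dict({
    "sentence1": ["A man is eating food.", "A plane is taking off."],
    "sentence2": ["A man is eating a meal.", "A dog is barking loudly."],
    "score": [0.9, 0.1],
})

# Scored pairs suit losses such as CoSENTLoss or CosineSimilarityLoss;
# unscored positive pairs would instead call for e.g. MultipleNegativesRankingLoss.
loss = losses.CoSENTLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```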

BatchAllTripletLoss

BatchHardSoftMarginTripletLoss

BatchHardTripletLoss

BatchSemiHardTripletLoss

ContrastiveLoss

OnlineContrastiveLoss

ContrastiveTensionLoss

ContrastiveTensionLossInBatchNegatives

CoSENTLoss

AnglELoss

CosineSimilarityLoss

SBERT Siamese Network Architecture

For each sentence pair, we pass sentence A and sentence B through our network, which yields the embeddings u and v. The similarity of these embeddings is computed using cosine similarity, and the result is compared to the gold similarity score.

This allows our network to be fine-tuned to recognize the similarity of sentences.
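
The following sketch illustrates the quantity this loss optimizes, assuming the "all-MiniLM-L6-v2" checkpoint and a hypothetical gold score. It is illustrative only; during actual training, CosineSimilarityLoss computes this inside the training loop with gradients flowing through the model.

```python
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base checkpoint

sentence_a = "A man is eating food."
sentence_b = "A man is eating a meal."
gold_score = 0.9  # hypothetical gold similarity score

# Siamese setup: both sentences pass through the same network.
u = model.encode(sentence_a, convert_to_tensor=True)
v = model.encode(sentence_b, convert_to_tensor=True)

# Cosine similarity of u and v, compared to the gold score with a squared error.
predicted = torch.cosine_similarity(u, v, dim=0)
loss = F.mse_loss(predicted, torch.tensor(gold_score, device=u.device))
print(float(loss))
```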

DenoisingAutoEncoderLoss

GISTEmbedLoss

CachedGISTEmbedLoss

MSELoss

MarginMSELoss

MatryoshkaLoss

Matryoshka2dLoss

AdaptiveLayerLoss

MegaBatchMarginLoss

MultipleNegativesRankingLoss

MultipleNegativesRankingLoss is a great loss function if you only have positive pairs, for example, pairs of similar texts such as paraphrases, duplicate questions, (query, response) pairs, or (source_language, target_language) pairs.
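
As a sketch, the snippet below fine-tunes with MultipleNegativesRankingLoss from such positive pairs using the classic InputExample / model.fit training loop (the newer SentenceTransformerTrainer works as well), assuming the "all-MiniLM-L6-v2" checkpoint and made-up (query, response) pairs. Within each batch, the responses of the other pairs act as negatives.

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base checkpoint

# Hypothetical positive pairs only: (query, response); no labels are needed.
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "You can reset your password on the account settings page."]),
    InputExample(texts=["What is the capital of France?",
                        "Paris is the capital of France."]),
]

# Larger batches provide more in-batch negatives and usually improve results.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```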

CachedMultipleNegativesRankingLoss

MultipleNegativesSymmetricRankingLoss

CachedMultipleNegativesSymmetricRankingLoss

SoftmaxLoss

TripletLoss