Datasets

sentence_transformers.datasets contains classes to organize your training input examples.

ParallelSentencesDataset

ParallelSentencesDataset is used for multilingual training. For details, see multilingual training.

SentenceLabelDataset

SentenceLabelDataset can be used if you have labeled sentences and want to train with triplet loss.
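The grouping idea behind training with triplet loss on labeled sentences can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper `group_by_label` and its `samples_per_label` and `seed` parameters are hypothetical names chosen for this example. It assumes that each label needs at least two sentences, so that an anchor and a positive can be drawn from the same label.

```python
import random
from collections import defaultdict


def group_by_label(labeled_sentences, samples_per_label=2, seed=0):
    """Order (sentence, label) pairs so that same-label sentences are adjacent.

    Hypothetical helper for illustration. Labels with fewer than
    `samples_per_label` sentences are dropped, because they cannot
    supply an anchor/positive pair.
    """
    rng = random.Random(seed)

    # Collect the sentences that belong to each label.
    by_label = defaultdict(list)
    for sentence, label in labeled_sentences:
        by_label[label].append(sentence)

    # Keep only labels with enough sentences, in random order.
    usable = [lab for lab, sents in by_label.items() if len(sents) >= samples_per_label]
    rng.shuffle(usable)

    ordered = []
    for label in usable:
        picked = rng.sample(by_label[label], samples_per_label)
        ordered.extend((sentence, label) for sentence in picked)
    return ordered


data = [
    ("good movie", "pos"),
    ("great film", "pos"),
    ("loved it", "pos"),
    ("bad movie", "neg"),
    ("awful film", "neg"),
    ("meh", "neutral"),  # only one example, so this label is skipped
]
ordered = group_by_label(data)
```

With a batch size that is a multiple of `samples_per_label`, every batch cut from `ordered` contains at least one positive pair per label it covers.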

DenoisingAutoEncoderDataset

DenoisingAutoEncoderDataset is used for unsupervised training with the TSDAE method.
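TSDAE trains the model to reconstruct a sentence from a corrupted copy of it. The sketch below illustrates the kind of (noisy, original) pairs such a dataset produces. It is an illustration under stated assumptions rather than the library's code: `delete_noise` and `make_denoising_pairs` are hypothetical helpers, tokenization is a plain whitespace split, and the deletion ratio of 0.6 is used as an assumed default.

```python
import random


def delete_noise(sentence, del_ratio=0.6, rng=None):
    """Randomly drop words from a sentence (hypothetical helper).

    Each word is removed with probability `del_ratio`. At least one
    word is always kept so the noisy input is never empty.
    """
    rng = rng or random.Random()
    words = sentence.split()  # assumption: simple whitespace tokenization
    if not words:
        return sentence
    kept = [w for w in words if rng.random() >= del_ratio]
    if not kept:
        kept = [rng.choice(words)]
    return " ".join(kept)


def make_denoising_pairs(sentences, del_ratio=0.6, seed=0):
    """Return (noisy_sentence, original_sentence) pairs."""
    rng = random.Random(seed)
    return [(delete_noise(s, del_ratio, rng), s) for s in sentences]


sentences = [
    "The quick brown fox jumps over the lazy dog",
    "Sentence embeddings can be trained without labels",
]
pairs = make_denoising_pairs(sentences)
```

Each pair holds the corrupted text first and the untouched sentence second, so no labels are needed beyond the raw sentences themselves.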

NoDuplicatesDataLoader

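NoDuplicatesDataLoader can be used together with MultipleNegativesRankingLoss to ensure that no duplicates are within the same batch. This matters because the loss treats every other example in the batch as a negative, so a duplicated text would be scored as a negative for itself.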
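The batching constraint can be illustrated with a small greedy routine. This is a sketch of the idea only, not the library's implementation: `no_duplicate_batches` is a hypothetical function, and unlike a real data loader it emits trailing batches smaller than `batch_size` and does no shuffling.

```python
def no_duplicate_batches(pairs, batch_size):
    """Yield batches of (anchor, positive) pairs in which no text repeats.

    Hypothetical helper for illustration. A pair that would introduce a
    text already present in the current batch is deferred to a later batch.
    """
    pending = list(pairs)
    while pending:
        batch, seen, deferred = [], set(), []
        for anchor, positive in pending:
            texts = {anchor, positive}
            if len(batch) < batch_size and not (seen & texts):
                batch.append((anchor, positive))
                seen |= texts
            else:
                deferred.append((anchor, positive))
        # The first pending pair always fits, so every pass makes progress.
        yield batch
        pending = deferred


pairs = [
    ("q1", "a1"),
    ("q1", "a2"),  # shares "q1" with the first pair
    ("q2", "a3"),
    ("q3", "a1"),  # shares "a1" with the first pair
]
batches = list(no_duplicate_batches(pairs, batch_size=2))
```

Here the second and fourth pairs are pushed into a later batch, so neither batch contains the same text twice.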