Datasets
sentence_transformers.datasets contains classes to organize your training input examples.
ParallelSentencesDataset
ParallelSentencesDataset is used for multilingual training. For details, see the multilingual training documentation.
SentenceLabelDataset
SentenceLabelDataset can be used if you have labeled sentences and want to train with triplet loss.
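Triplet loss needs at least two sentences with the same label in a batch so that a positive pair can be formed. The following is a minimal pure-Python sketch of that grouping idea, not the library's implementation. The function name group_by_label, its samples_per_label and seed parameters, and the choice to skip labels with too few sentences are assumptions made for illustration.

```python
import random
from collections import defaultdict


def group_by_label(examples, samples_per_label=2, seed=0):
    """Return (sentence, label) pairs ordered so that each label
    contributes `samples_per_label` consecutive examples.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Collect the sentences that belong to each label.
    by_label = defaultdict(list)
    for sentence, label in examples:
        by_label[label].append(sentence)

    # A label with too few sentences cannot form a positive pair,
    # so this sketch skips it (an assumption of the sketch).
    labels = [
        label for label, sentences in by_label.items()
        if len(sentences) >= samples_per_label
    ]
    rng.shuffle(labels)

    ordered = []
    for label in labels:
        picked = rng.sample(by_label[label], samples_per_label)
        ordered.extend((sentence, label) for sentence in picked)
    return ordered


examples = [
    ("The cat sits outside", "animals"),
    ("A dog barks loudly", "animals"),
    ("Stocks fell sharply today", "finance"),
    ("The bank raised interest rates", "finance"),
    ("An unrelated sentence", "misc"),
]
ordered = group_by_label(examples, samples_per_label=2)
```

With this ordering, any batch whose size is a multiple of samples_per_label contains at least one positive pair for every label it includes.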
DenoisingAutoEncoderDataset
DenoisingAutoEncoderDataset is used for unsupervised training with the TSDAE method.
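TSDAE trains a model to reconstruct a sentence from a corrupted copy of it, so each training item pairs a noisy sentence with its original. The sketch below illustrates that pairing with token deletion as the noise. It is a simplified pure-Python illustration, not the library's code: the helpers delete_tokens and make_denoising_pairs are hypothetical, whitespace tokenization is an assumption, and the deletion ratio of 0.6 is used here as an assumed default.

```python
import random


def delete_tokens(sentence, del_ratio=0.6, rng=None):
    """Drop each whitespace token with probability `del_ratio`.

    Hypothetical helper; keeps at least one token so the noisy
    input is never empty.
    """
    rng = rng or random.Random()
    tokens = sentence.split()
    if not tokens:
        return sentence
    kept = [token for token in tokens if rng.random() > del_ratio]
    if not kept:
        kept = [rng.choice(tokens)]
    return " ".join(kept)


def make_denoising_pairs(sentences, del_ratio=0.6, seed=0):
    """Build (noisy_sentence, original_sentence) training pairs."""
    rng = random.Random(seed)
    return [(delete_tokens(s, del_ratio, rng), s) for s in sentences]


sentences = [
    "The quick brown fox jumps over the lazy dog",
    "Sentence embeddings map text to dense vectors",
]
pairs = make_denoising_pairs(sentences)
```

Only unlabeled sentences are needed as input, which is why this kind of dataset suits unsupervised training.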
NoDuplicatesDataLoader
NoDuplicatesDataLoader can be used together with MultipleNegativesRankingLoss to ensure that no batch contains duplicate entries.
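In-batch negative losses treat the other examples in a batch as negatives, so a text that appears twice in one batch would be scored as a negative for itself. The following pure-Python sketch shows one way to build duplicate-free batches. It is an illustration of the idea rather than the library's implementation: the function no_duplicate_batches is hypothetical, and unlike a real data loader it keeps smaller trailing batches instead of dropping or refilling them.

```python
def no_duplicate_batches(pairs, batch_size):
    """Split (anchor, positive) text pairs into batches in which
    no text occurs more than once.

    Hypothetical helper for illustration only.
    """
    batches = []
    remaining = list(pairs)
    while remaining:
        batch, seen, deferred = [], set(), []
        for pair in remaining:
            texts = set(pair)
            # Accept the pair only if the batch has room and none of
            # its texts already occur in the batch.
            if len(batch) < batch_size and not (texts & seen):
                batch.append(pair)
                seen |= texts
            else:
                deferred.append(pair)
        batches.append(batch)
        # Pairs that did not fit are retried in the next batch.
        remaining = deferred
    return batches


pairs = [
    ("q1", "a1"),
    ("q1", "a2"),  # shares "q1" with the first pair
    ("q2", "a3"),
    ("q3", "a1"),  # shares "a1" with the first pair
    ("q4", "a4"),
]
batches = no_duplicate_batches(pairs, batch_size=2)
```

Each pass always accepts at least the first remaining pair, so the loop terminates and every pair ends up in exactly one batch.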