SentenceTransformers supports a variety of use cases. The folders below contain example scripts that showcase how it can be used.
The computing-embeddings folder contains examples of how to compute sentence embeddings using SentenceTransformers.
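As a rough illustration of computing embeddings, the sketch below wraps the library's `SentenceTransformer.encode` call in a small helper. The helper name `compute_embeddings` and the model name `all-MiniLM-L6-v2` are assumptions made for this example; the scripts in the folder may be organized differently.

```python
from typing import Any, List, Optional, Sequence


def compute_embeddings(
    sentences: Sequence[str], model: Optional[Any] = None
) -> List[List[float]]:
    """Return one embedding vector (a list of floats) per sentence.

    `model` can be any object exposing `encode(list_of_str)`. If it is
    omitted, a SentenceTransformer is loaded; the model name used here is
    an assumption for illustration only.
    """
    if model is None:
        # Imported lazily so the helper can be exercised with a stub model
        # even when sentence-transformers is not installed.
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model name
    vectors = model.encode(list(sentences))
    # Convert array-like rows into plain Python lists of floats.
    return [[float(x) for x in row] for row in vectors]
```

Calling `compute_embeddings(["A first sentence.", "A second one."])` would download the assumed model on first use and return two vectors of equal length.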
The clustering folder shows how SentenceTransformers can be used for text clustering, i.e., grouping sentences together based on their similarity.
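To make "grouping by similarity" concrete, here is a minimal sketch of a greedy, threshold-based clustering over precomputed embeddings. It is a simplified stand-in chosen for brevity, not necessarily the algorithm used by the scripts in the folder; the function names and the 0.8 threshold are assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def threshold_clusters(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.8
) -> List[List[int]]:
    """Greedily group embedding indices.

    Each vector joins the first cluster whose seed (its first member) is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: List[List[int]] = []
    for idx, vec in enumerate(embeddings):
        for cluster in clusters:
            if cosine(vec, embeddings[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            # No existing seed was similar enough.
            clusters.append([idx])
    return clusters
```

The result depends on input order and on the threshold, which is the price of the greedy single-pass design; it is meant to show the idea, not to replace a proper clustering algorithm.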
SentenceTransformers also supports training and inference of Cross-Encoders. With a Cross-Encoder, two sentences are presented simultaneously to the transformer network, which outputs either a score (0…1) indicating their similarity or a label.
Parallel Sentence Mining
The parallel-sentence-mining folder contains examples of how parallel (translated) sentences can be found in two corpora in different languages. For example, you take the English and the Spanish Wikipedia, and the script finds and returns all translated English-Spanish sentence pairs.
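One simple way to picture the pairing step is a mutual-nearest-neighbor match between the embeddings of the two corpora. The sketch below assumes both sides were embedded with the same multilingual model, and it is a deliberately simplified illustration rather than the scoring used by the folder's scripts; all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mutual_nearest_pairs(
    src: Sequence[Sequence[float]], tgt: Sequence[Sequence[float]]
) -> List[Tuple[int, int, float]]:
    """Return (src_index, tgt_index, score) for pairs that pick each other.

    A pair is kept only when tgt[j] is the best match for src[i] AND src[i]
    is the best match for tgt[j].
    """
    if not src or not tgt:
        return []
    sims = [[cosine(s, t) for t in tgt] for s in src]
    # Best target for every source sentence, and best source for every target.
    best_tgt = [max(range(len(tgt)), key=lambda j: sims[i][j]) for i in range(len(src))]
    best_src = [max(range(len(src)), key=lambda i: sims[i][j]) for j in range(len(tgt))]
    return [(i, j, sims[i][j]) for i, j in enumerate(best_tgt) if best_src[j] == i]
```

Requiring agreement in both directions filters out sentences that have no real translation on the other side, at the cost of a full similarity matrix, which only suits small inputs.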
The paraphrase-mining folder contains examples that find all paraphrases in a large set of sentences. This can be used, for example, to find duplicate questions or duplicate sentences in a set of millions of questions or sentences.
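The core idea can be sketched as an all-pairs comparison over precomputed embeddings, keeping the highest-scoring pairs. This brute-force version is quadratic in the number of sentences and is only meant to show the logic; the function names and default values are assumptions. For large collections, the library's `util.paraphrase_mining` helper is the more practical route.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_paraphrases(
    embeddings: Sequence[Sequence[float]],
    min_score: float = 0.8,
    top_k: int = 10,
) -> List[Tuple[float, int, int]]:
    """Return up to `top_k` (score, i, j) tuples with i < j, best first."""
    pairs: List[Tuple[float, int, int]] = []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            score = cosine(embeddings[i], embeddings[j])
            if score >= min_score:
                pairs.append((score, i, j))
    # Highest similarity first.
    pairs.sort(key=lambda p: p[0], reverse=True)
    return pairs[:top_k]
```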
The semantic-search folder shows examples of semantic search: given a query sentence, find semantically similar sentences in a large collection.
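A minimal sketch of the search step, assuming the query and the collection have already been embedded with the same model: score every corpus vector against the query by cosine similarity and return the best hits. The function names and the default `top_k` are hypothetical choices for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(
    query: Sequence[float],
    corpus: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return up to `top_k` (corpus_index, score) pairs, most similar first."""
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(corpus)]
    scored.sort(key=lambda p: p[0], reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]
```

The exhaustive scan keeps the example short; for large collections an approximate nearest-neighbor index is the usual replacement for the loop over the whole corpus.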
Retrieve & Rerank
The retrieve_rerank folder shows how to combine a bi-encoder for semantic search retrieval with a more powerful re-ranking stage that uses a cross-encoder.
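The two-stage structure can be sketched independently of any particular model: a cheap scorer narrows the corpus to a few candidates, and a more expensive scorer reorders only those. In the sketch below, `word_overlap` and `phrase_aware_score` are toy stand-ins (assumptions for illustration) for the bi-encoder similarity and the cross-encoder score; they are not part of the library.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[str, str], float]


def retrieve_and_rerank(
    query: str,
    corpus: Sequence[str],
    retrieve_score: Scorer,
    rerank_score: Scorer,
    retrieve_k: int = 10,
    final_k: int = 3,
) -> List[str]:
    """Stage 1 keeps the top `retrieve_k` documents; stage 2 reorders them."""
    # Stage 1: cheap scoring over the whole corpus (bi-encoder role).
    candidates = sorted(
        range(len(corpus)),
        key=lambda i: retrieve_score(query, corpus[i]),
        reverse=True,
    )[:retrieve_k]
    # Stage 2: expensive scoring on the candidates only (cross-encoder role).
    reranked = sorted(
        candidates,
        key=lambda i: rerank_score(query, corpus[i]),
        reverse=True,
    )
    return [corpus[i] for i in reranked[:final_k]]


def word_overlap(query: str, doc: str) -> float:
    """Toy stand-in for bi-encoder similarity: share of query words in doc."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def phrase_aware_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: rewards the exact query phrase."""
    bonus = 1.0 if query.lower() in doc.lower() else 0.0
    return word_overlap(query, doc) + bonus
```

In a real pipeline the first scorer would compare precomputed embeddings and the second would score each (query, document) pair jointly, which is why the second stage is restricted to a short candidate list.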
The image-search folder shows how to use the image-and-text models, which map images and text to the same vector space. This makes it possible to search for images that match a user query.