Hugging Face 🤗¶
The Hugging Face Hub¶
All models on the Hugging Face Hub come with the following:

- An automatically generated model card with a description, example code snippets, architecture overview, and more.
- Metadata tags that aid discoverability and contain additional information such as the usage license.
- An interactive widget you can use to try the model directly in the browser.
- An Inference API that allows you to make inference requests.
Using Hugging Face models¶
Any pre-trained model from the Hub can be loaded with a single line of code:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('model_name')
```
You can even click "Use in sentence-transformers" to get a code snippet that you can copy and paste!
Here is an example that loads the multi-qa-MiniLM-L6-cos-v1 model, uses it to encode sentences, and then computes the similarity between them for semantic search:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

query_embedding = model.encode('How big is London')
passage_embedding = model.encode([
    'London has 9,787,426 inhabitants at the 2011 census',
    'London is known for its financial district',
])

print("Similarity:", util.dot_score(query_embedding, passage_embedding))
```
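Under the hood, `util.dot_score` computes the dot product between the query embedding and each passage embedding. A minimal plain-Python sketch of that computation, using made-up toy vectors in place of real embeddings from `model.encode()`:

```python
# Toy illustration of what util.dot_score computes: the dot product
# between a query vector and each passage vector. The vectors here are
# invented for the example; real embeddings come from model.encode().
def dot_score(query, passages):
    return [sum(q * p for q, p in zip(query, passage)) for passage in passages]

query = [0.1, 0.3, 0.5]
passages = [[0.2, 0.1, 0.4], [0.0, 0.5, 0.1]]
print(dot_score(query, passages))  # one score per passage; higher = more similar
```

For models trained with normalized embeddings (like the `-cos-` variants), the dot product coincides with cosine similarity.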
Here is another example, this time using the clips/mfaq model for multilingual FAQ retrieval. After embedding the query and the answers, we perform a semantic search to find the most relevant answer. Note that this model expects questions to be prefixed with `<Q>` and answers with `<A>`:
```python
from sentence_transformers import SentenceTransformer, util

question = "<Q>How many models can I host on HuggingFace?"
answer_1 = "<A>All plans come with unlimited private models and datasets."
answer_2 = "<A>AutoNLP is an automatic way to train and deploy state-of-the-art NLP models, seamlessly integrated with the Hugging Face ecosystem."
answer_3 = "<A>Based on how much training data and model variants are created, we send you a compute cost and payment link - as low as $10 per job."

model = SentenceTransformer('clips/mfaq')

query_embedding = model.encode(question)
corpus_embeddings = model.encode([answer_1, answer_2, answer_3])

print(util.semantic_search(query_embedding, corpus_embeddings))
```
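`util.semantic_search` returns one result list per query, where each hit is a dict with `corpus_id` and `score` keys, sorted by decreasing score. A self-contained sketch of that output shape, using toy vectors and a plain dot-product score in place of real embeddings (a simplification of the library's actual batched implementation):

```python
# Sketch of the output shape of util.semantic_search: for each query it
# returns hits as {'corpus_id': ..., 'score': ...} dicts, best first.
# The vectors are toy values standing in for model.encode() output.
def semantic_search(query_embedding, corpus_embeddings, top_k=10):
    scores = [
        sum(q * c for q, c in zip(query_embedding, emb))  # dot-product score
        for emb in corpus_embeddings
    ]
    hits = [{'corpus_id': i, 'score': s} for i, s in enumerate(scores)]
    hits.sort(key=lambda h: h['score'], reverse=True)
    return [hits[:top_k]]  # one result list per query

results = semantic_search([0.2, 0.8], [[0.1, 0.1], [0.3, 0.9]])
best = results[0][0]
print("Best answer index:", best['corpus_id'], "score:", best['score'])
```

With the real library, `results[0][0]['corpus_id']` would give the index of the most relevant answer, so you could look it up in the original answer list.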