Word Embedding using Universal Sentence Encoder in Python
  • Last Updated : 26 Mar, 2021

Unlike word embedding techniques, which represent individual words as vectors, sentence embeddings map an entire sentence or text, along with its semantic information, to a vector of real numbers. This makes it possible to capture the context and meaning of a whole text rather than of its individual words.

In this article, you will learn how to create vectors for complete sentences using the Universal Sentence Encoder.

For example, consider two sentences:

  1. How old are you?
  2. What is your age?

The two sentences above have the same meaning: both ask for the person's age. Taken individually, the words and their vectors do not give good insight into what each complete sentence conveys, nor can they reliably tell whether the two sentences are similar. In such scenarios, sentence embeddings perform better than word embeddings.
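To make the point concrete, here is a minimal sketch that compares the two example sentences by plain word overlap (Jaccard similarity). The helper names `tokenize` and `word_overlap` are hypothetical and chosen only for this illustration; they are not part of any library used in this article.

```python
import string


def tokenize(sentence):
    """Lowercase the sentence, drop punctuation, and return its set of words."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())


def word_overlap(sentence_a, sentence_b):
    """Jaccard similarity: shared words divided by all distinct words."""
    words_a, words_b = tokenize(sentence_a), tokenize(sentence_b)
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


score = word_overlap("How old are you?", "What is your age?")
print(score)  # 0.0, because the two sentences share no words
```

A purely word-level comparison scores the pair as completely dissimilar even though the meaning is the same, which is the gap sentence embeddings aim to close.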



There are various sentence embedding techniques, such as Doc2Vec, SentenceBERT, and the Universal Sentence Encoder.

Universal Sentence Encoder

The Universal Sentence Encoder encodes an entire sentence or text into a vector of real numbers that can be used for clustering, sentence similarity, text classification, and other natural language processing (NLP) tasks. The pre-trained model is available on TensorFlow Hub under the Apache-2.0 license. It is trained on text longer than a single word (phrases, sentences, and paragraphs) using a deep averaging network (DAN) encoder.
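The idea behind a deep averaging network can be sketched in a few lines: average the word vectors of a sentence, then pass the average through a feed-forward layer. Everything below is an assumption made for illustration: the toy vocabulary, the random untrained weights, the single layer, and the 8-dimensional size. It is not the actual pre-trained model, which is trained and outputs 512-dimensional vectors.

```python
import math
import random

rng = random.Random(0)
EMBED_DIM = 8   # toy size chosen for this sketch
HIDDEN_DIM = 8  # size of the toy output vector

# Hypothetical toy vocabulary with random (untrained) word vectors.
vocab = ["how", "old", "are", "you", "what", "is", "your", "age"]
word_vectors = {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in vocab}

# Random weights of one feed-forward layer (untrained, illustration only).
weights = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(HIDDEN_DIM)]
biases = [0.0] * HIDDEN_DIM


def dan_encode(sentence):
    """Average the word vectors, then apply one tanh feed-forward layer."""
    tokens = [t for t in sentence.lower().split() if t in word_vectors]
    if not tokens:
        raise ValueError("sentence contains no known words")
    # Step 1: element-wise average of the word vectors.
    averaged = [
        sum(word_vectors[t][i] for t in tokens) / len(tokens)
        for i in range(EMBED_DIM)
    ]
    # Step 2: feed-forward layer with a tanh non-linearity.
    return [
        math.tanh(sum(w * x for w, x in zip(row, averaged)) + b)
        for row, b in zip(weights, biases)
    ]


vector = dan_encode("how old are you")
print(len(vector))  # 8
```

Because the input is an average, a sentence of any length maps to a vector of the same fixed size.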

Implementation of sentence embeddings using Universal Sentence Encoder: 

Run these commands in your terminal to install the necessary libraries before running the code.

pip install "tensorflow>=2.0.0"

pip install --upgrade tensorflow-hub

Program:

Python3






# import necessary libraries
import tensorflow_hub as hub
  
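# Load pre-trained universal sentence encoder model
# (assumption: version 4 of the model on TensorFlow Hub)
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
embed = hub.load(module_url)
  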
# Sentences for which you want to create embeddings,
# passed as an array in embed()
Sentences = [
    "How old are you",
    "What is your age",
    "I love to watch Television",
    "I am wearing a wrist watch"
]
embeddings = embed(Sentences)
  
# Printing embeddings of each sentence
print(embeddings)
  
# To print each embedding along with its corresponding
# sentence, the code below can be used.
for i in range(len(Sentences)):
    print(Sentences[i])
    print(embeddings[i])

Output: 

tf.Tensor(
[[-0.06045125 -0.00204541  0.02656925 ...  0.00764413 -0.02669661
   0.05110302]
 [-0.08415682 -0.08687923  0.03446117 ... -0.01439389 -0.04546221
   0.03639965]
 [ 0.0816019  -0.01570276 -0.05659245 ... -0.07133699  0.11040762
  -0.0071095 ]
 [-0.00369539  0.03064634 -0.05556112 ...  0.01751423  0.0316496
  -0.05139377]], shape=(4, 512), dtype=float32)

Explanation:

The output above shows the input sentences encoded as vectors by the Universal Sentence Encoder. The tensor has shape (4, 512): one 512-dimensional vector for each of the four sentences.
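A common way to compare two such vectors is cosine similarity. The sketch below is a plain-Python version; the function name `cosine_similarity` is a hypothetical helper written for this illustration, and the 2-dimensional inputs are toy vectors, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: same direction, then perpendicular directions.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # 0.0
```

With the program above, the same function could be applied to rows of the result, for example `cosine_similarity(embeddings[0].numpy().tolist(), embeddings[1].numpy().tolist())`. One would expect the two age questions to score higher against each other than against the other sentences, though no such scores were measured for this article.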
