
Sentiment Analysis of Hindi Text – Python

Last Updated : 04 Mar, 2022

Sentiment Analysis for Indic Language:  

This article shows how to use the VADER library to perform sentiment analysis of text in the Indic language Hindi.

Sentiment analysis is a technique that determines how positive, negative, or neutral a piece of text is. It is performed on textual data to help businesses monitor brand and product sentiment in customer feedback and understand customer needs. It is a time-efficient, cost-effective way to analyse large volumes of data. Python has good library support for sentiment analysis; a few of the libraries available for this purpose are NLTK, TextBlob and VADER.

To perform sentiment analysis of Indic languages such as Hindi, we need to do the following tasks.

  1. Read the text file which is in Hindi.
  2. Translate the Hindi sentences into English, because these Python libraries are built to analyse English text. (If you pass Hindi sentences directly to their functions, the ‘compound score’, which is the metric that expresses the sentiment of a sentence, is not computed correctly. So it is appropriate to convert each sentence to its English equivalent before computing this metric.) Google Translator helps with this task.
  3. Do sentiment analysis of the translated text using any of the libraries mentioned above.
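
The three tasks above can be sketched as one small function. This is only an illustration: `analyse_lines`, `fake_translate` and `fake_score` are hypothetical names that do not come from any library, and the two `fake_` helpers are stand-ins so that the sketch runs without network access.

```python
def analyse_lines(lines, translate, score):
    """Translate each non-empty line to English and score the translation.

    `translate` maps a source sentence to English text.
    `score` maps English text to a sentiment dictionary.
    """
    results = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        english = translate(text)
        results.append({"source": text, "english": english, "scores": score(english)})
    return results


# Hypothetical stand-ins for the translator and the analyzer (assumptions,
# used only so the sketch runs offline).
def fake_translate(text):
    known = {"मेरी बेटी बहुत गुस्से में थी।": "My daughter was very angry."}
    return known.get(text, text)


def fake_score(text):
    # A stub, not a real sentiment model.
    return {"compound": -0.5563 if "angry" in text else 0.0}


results = analyse_lines(["मेरी बेटी बहुत गुस्से में थी।\n", "\n"], fake_translate, fake_score)
print(results)
```

In the real program, `translate` would be the `translate` method of a `GoogleTranslator(source='auto', target='en')` object and `score` would be the `polarity_scores` method of a `SentimentIntensityAnalyzer` object, as used in the steps below.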

The following steps need to be done.

Step 1: Import the necessary libraries/packages.

Python3




# codecs provides access to the internal Python codec registry
import codecs
 
# This is to translate the text from Hindi to English
from deep_translator import GoogleTranslator
 
# This is to analyse the sentiment of text
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer


Step 2: Read the file data. The ‘codecs’ library provides access to the internal Python codec registry. Most standard codecs are text encodings, which encode text to bytes; custom codecs may encode and decode between arbitrary types. Here it is used to open the file with the UTF-8 encoding so that the Hindi (Devanagari) text is read correctly.

Python3




# Read the hindi text into 'sentences'
with codecs.open('SampleHindiText.txt', encoding='utf-8') as f:
    sentences = f.readlines()
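
As an alternative sketch, the built-in open() function also accepts an encoding argument, so the same file can be read without codecs. The helper name `read_sentences` is hypothetical; the example writes a small temporary copy of the sample file so that it can run on its own, and it also drops blank lines and trailing newlines.

```python
import tempfile
from pathlib import Path


def read_sentences(path):
    """Return the non-empty lines of a UTF-8 text file, without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Write two of the sample sentences (plus a blank line) to a temporary file,
# then read them back.
sample = "गोवा की यात्रा बहुत अच्छी रही।\nसमुद्र तट बहुत गर्म थे।\n\n"
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "SampleHindiText.txt"
    path.write_text(sample, encoding="utf-8")
    sentences = read_sentences(path)

print(sentences)
```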


Step 3: Translate each sentence into English so that the VADER library can analyse its sentiment. The polarity_scores() method returns a sentiment dictionary for the text. This dictionary includes the ‘compound’ score, which indicates the sentiment of the sentence as given below.

  • Positive sentiment: compound score >= 0.05
  • Neutral sentiment: compound score > -0.05 and compound score < 0.05
  • Negative sentiment: compound score <= -0.05
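
The thresholds listed above can be wrapped in a small helper. This is a minimal sketch; the function name `label_from_compound` and its `threshold` parameter are hypothetical and are not part of VADER.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a sentiment label using the thresholds above."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Try a few scores, including the boundary values.
for value in (0.6249, 0.05, 0.0, -0.05, -0.5563):
    print(value, label_from_compound(value))
```

Keeping the rule in one function makes it easy to reuse the same thresholds wherever a compound score needs a label.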

Python3




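# Create the translator and the analyzer once and reuse them for every sentence
translator = GoogleTranslator(source='auto', target='en')
analyzer = SentimentIntensityAnalyzer()

for sentence in sentences:
    # Skip blank lines, which carry no text to translate
    if not sentence.strip():
        continue

    # Translate the Hindi sentence into English
    translated_text = translator.translate(sentence)

    # Compute the VADER sentiment scores of the translated sentence
    sentiment_dict = analyzer.polarity_scores(translated_text)

    print("\nTranslated Sentence=", translated_text, "\nDictionary=", sentiment_dict)
    if sentiment_dict['compound'] >= 0.05:
        print("It is a Positive Sentence")
    elif sentiment_dict['compound'] <= -0.05:
        print("It is a Negative Sentence")
    else:
        print("It is a Neutral Sentence")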


• The source file ‘SampleHindiText.txt’ is as given below.

गोवा की यात्रा बहुत अच्छी रही।
समुद्र तट बहुत गर्म थे।
मुझे समुद्र तट पर खेलने में बहुत मजा आया।
मेरी बेटी बहुत गुस्से में थी।

• The output of the code is shown below.

Translated Sentence= The trip to Goa was great. 
Dictionary= {'neg': 0.0, 'neu': 0.549, 'pos': 0.451, 'compound': 0.6249}
It is a Positive Sentence

Translated Sentence= The beaches were very hot. 
Dictionary= {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
It is a Neutral Sentence

Translated Sentence= I really enjoyed playing on the beach. 
Dictionary= {'neg': 0.0, 'neu': 0.469, 'pos': 0.531, 'compound': 0.688}
It is a Positive Sentence

Translated Sentence= My daughter was very angry. 
Dictionary= {'neg': 0.473, 'neu': 0.527, 'pos': 0.0, 'compound': -0.5563}
It is a Negative Sentence
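
In the dictionaries above, the ‘neg’, ‘neu’ and ‘pos’ values are proportions of the text that fall into each category, so they should add up to about 1 for every sentence. The short sketch below checks this on the four dictionaries copied from the output; the variable names are chosen only for illustration.

```python
# Sentiment dictionaries copied from the output shown above.
dictionaries = [
    {'neg': 0.0, 'neu': 0.549, 'pos': 0.451, 'compound': 0.6249},
    {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0},
    {'neg': 0.0, 'neu': 0.469, 'pos': 0.531, 'compound': 0.688},
    {'neg': 0.473, 'neu': 0.527, 'pos': 0.0, 'compound': -0.5563},
]

# Add the three proportions for each sentence.
totals = [d['neg'] + d['neu'] + d['pos'] for d in dictionaries]

for d, total in zip(dictionaries, totals):
    # Allow a small tolerance because the values are rounded.
    print(d, "close to 1:", abs(total - 1.0) < 0.01)
```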

