Sentiment Analysis for Indic Languages
This article shows how to use the VADER library to perform sentiment analysis on text in Hindi, an Indic language.
Sentiment analysis is the task of determining whether a piece of text is positive, negative or neutral. It is applied to textual data to help businesses monitor brand and product sentiment in customer feedback and understand customer needs. Automating it is a time-efficient and cost-friendly way to analyse large volumes of data. Python has good library support for sentiment analysis; a few of the libraries available for this purpose are NLTK, TextBlob and VADER.
To perform sentiment analysis on an Indic language such as Hindi, we need to carry out the following tasks.
- Read the text file which is in Hindi.
- Translate the Hindi sentences into English, because these Python libraries are built primarily for text analysis in English. (If Hindi sentences are passed to such functions directly, the ‘compound score’, which is the metric of a sentence's sentiment, is not calculated correctly. The text should therefore be converted to its English equivalent before this metric is computed.) Google Translate, accessed here through the GoogleTranslator class of the deep_translator package, helps with this task.
- Do sentiment analysis of the translated text using any of the libraries mentioned above.
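The three tasks above can be sketched as one small pipeline. This is only an illustration: the function name `analyze_lines` and the two stand-in helpers `fake_translate` and `fake_score` are hypothetical and are not part of VADER or deep_translator. They exist so that the sketch runs without a network connection or any third-party package.

```python
from typing import Callable, Iterable, List, Tuple


def analyze_lines(
    lines: Iterable[str],
    translate: Callable[[str], str],
    compound_score: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Translate each non-empty line, then score the English text."""
    results = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        english = translate(text)
        results.append((english, compound_score(english)))
    return results


# Hypothetical stand-ins (NOT real translation or sentiment scoring).
def fake_translate(text: str) -> str:
    lookup = {"गोवा की यात्रा बहुत अच्छी रही।": "The trip to Goa was great."}
    return lookup.get(text, text)


def fake_score(text: str) -> float:
    return 0.6249 if "great" in text else 0.0


print(analyze_lines(["गोवा की यात्रा बहुत अच्छी रही।\n", "\n"], fake_translate, fake_score))
```

In real use, `translate` could be `GoogleTranslator(source='auto', target='en').translate`, and `compound_score` could be a function that returns `analyzer.polarity_scores(text)['compound']`, as in the steps that follow.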
The steps are as follows.
Step 1: Import the necessary libraries/packages.
Python3
import codecs
from deep_translator import GoogleTranslator
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
Step 2: Read the data from the file. The ‘codecs’ module provides access to the internal Python codec registry. Most standard codecs are text encodings, which encode text to bytes; custom codecs may encode and decode between arbitrary types. Here it is used to open the file with UTF-8 encoding so that the Hindi text is decoded correctly.
Python3
# Open the file as UTF-8 and read one sentence per line
with codecs.open('SampleHindiText.txt', encoding='utf-8') as f:
    sentences = f.readlines()
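As an alternative sketch, the same read can be done without ‘codecs’, because Python 3's standard file APIs also accept an `encoding` argument. The helper name `read_sentences` is hypothetical; it additionally strips trailing newlines and drops blank lines, which the step above does not do.

```python
from pathlib import Path
from typing import List


def read_sentences(path: str) -> List[str]:
    """Return the non-empty lines of a UTF-8 text file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

For example, `read_sentences('SampleHindiText.txt')` would return a list with one string per Hindi sentence in the file.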
Step 3: Translate each sentence into English so that the VADER library can process the translated text for sentiment analysis. The polarity_scores() method returns a sentiment dictionary for the text, which includes the ‘compound’ score. This score indicates the sentiment of the sentence, as given below.
- Positive sentiment: compound score >= 0.05
- Neutral sentiment: compound score > -0.05 and compound score < 0.05
- Negative sentiment: compound score <= -0.05
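The thresholds above can be wrapped in a small helper, shown here as a minimal sketch. The function name `sentiment_label` is hypothetical and is not part of the VADER library; it only maps a compound score to one of the three labels.

```python
def sentiment_label(compound: float) -> str:
    """Map a VADER compound score to a label using the thresholds listed above."""
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"
```

Keeping the thresholds in one function means they can be changed in a single place if a different cut-off suits the data better.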
Python3
analyzer = SentimentIntensityAnalyzer()  # create the analyzer once and reuse it
for sentence in sentences:
    sentence = sentence.strip()
    if not sentence:
        continue  # skip blank lines
    # Translate the Hindi sentence into English
    translated_text = GoogleTranslator(source='auto', target='en').translate(sentence)
    # Compute the sentiment scores of the translated sentence
    sentiment_dict = analyzer.polarity_scores(translated_text)
    print("\nTranslated Sentence=", translated_text, "\nDictionary=", sentiment_dict)
    if sentiment_dict['compound'] >= 0.05:
        print("It is a Positive Sentence")
    elif sentiment_dict['compound'] <= -0.05:
        print("It is a Negative Sentence")
    else:
        print("It is a Neutral Sentence")
• The source file ‘SampleHindiText.txt’ is as given below.
गोवा की यात्रा बहुत अच्छी रही।
समुद्र तट बहुत गर्म थे।
मुझे समुद्र तट पर खेलने में बहुत मजा आया।
मेरी बेटी बहुत गुस्से में थी।
• The output of the code is shown below.
Translated Sentence= The trip to Goa was great.
Dictionary= {'neg': 0.0, 'neu': 0.549, 'pos': 0.451, 'compound': 0.6249}
It is a Positive Sentence
Translated Sentence= The beaches were very hot.
Dictionary= {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
It is a Neutral Sentence
Translated Sentence= I really enjoyed playing on the beach.
Dictionary= {'neg': 0.0, 'neu': 0.469, 'pos': 0.531, 'compound': 0.688}
It is a Positive Sentence
Translated Sentence= My daughter was very angry.
Dictionary= {'neg': 0.473, 'neu': 0.527, 'pos': 0.0, 'compound': -0.5563}
It is a Negative Sentence