Prerequisite: Create a Virtual Machine and setup API on Google Cloud
In this article, we will discuss how to use Google's Translation and Natural Language Processing features using Google Cloud. Before reading this article, you should have an idea of how to create a Virtual Machine instance and how to set up an API (refer this).
Translation API –
- The Google Translation API works in the same way as described here.
- First enable the Cloud Translation API and download the .json file containing the credential information as instructed here.
You need to install the following packages –

pip install google.cloud
pip install google.cloud.translate
Save the credentials.json file in the same folder as the .py file containing the Python code. We need to set the path of credentials.json (C:\Users\…) as the environment variable 'GOOGLE_APPLICATION_CREDENTIALS', which is done in line 5 of the following code.
import os
import io
from google.cloud import translate

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

input_file = "filename_input.txt"
output_file = "filename_output.txt"

# The encoding needs to be utf-8 (Unicode) because Google Cloud
# supports all languages, while ASCII covers only English.
with io.open(input_file, "r", encoding="utf-8") as inp:
    data = inp.read()

translate_client = translate.Client()

translated = list()
translated.append(translate_client.translate(
    data, target_language='en')['translatedText'])

with io.open(output_file, mode='w', encoding='utf-8',
             errors='ignore') as out:
    out.write('\n'.join(translated))
The input and output .txt files should be in the same folder as the Python file; otherwise, the full path must be provided. The input file should contain the source text in any language. The user does not need to specify the source language, as Google detects it automatically. The target language, however, needs to be provided as an ISO 639-1 code (for example, the above code translates the text into English, coded by 'en').
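As an illustration of ISO 639-1 target codes, the sketch below keeps a small lookup table and resolves a human-readable language name to the code expected by `target_language`. The helper and its table are a hypothetical convenience (a tiny subset of ISO 639-1), not part of the google.cloud.translate library:

```python
# A small convenience subset of ISO 639-1 codes (hypothetical helper,
# not part of the google.cloud.translate library).
ISO_639_1 = {
    'english': 'en',
    'french': 'fr',
    'german': 'de',
    'hindi': 'hi',
    'spanish': 'es',
}

def target_code(language_name):
    """Resolve a language name to its ISO 639-1 code; raises KeyError if unknown."""
    return ISO_639_1[language_name.strip().lower()]

print(target_code('French'))   # -> fr
```

The resulting code could then be passed as `target_language=target_code('French')` in the `translate()` call above.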
Natural Language Processing –
Enable the Cloud Natural Language API and download the 'credentials.json' file as explained here. You need to install the following package –
pip install google.cloud.language
Google’s Natural Language Processing API provides several methods for analyzing text. All of them are valuable aspects of Language Analysis.
Sentiment Analysis:
Sentiment Analysis examines the text and determines its prevailing emotional opinion. The output is a score in the range -1 to 1, where -1 signifies entirely negative emotion, 1 signifies entirely positive emotion, and 0 signifies neutral. It also outputs a magnitude, ranging from 0 to infinity, indicating the overall strength of emotion.
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()

input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

annotation = client.analyze_sentiment(document=text)
score = annotation.document_sentiment.score
magnitude = annotation.document_sentiment.magnitude

for index, sentence in enumerate(annotation.sentences):
    sentence_sentiment = sentence.sentiment.score
    print('Sentence #{} Sentiment score: {}'.format(
        index + 1, sentence_sentiment))

print('Score: {}, Magnitude: {}'.format(score, magnitude))
The text should be present in the file titled filename_input.txt. The above code will analyze and print the sentiment of the text sentence by sentence, and will also provide the overall sentiment.
Clearly Positive -> Score: 0.8, Magnitude: 3.0
Clearly Negative -> Score: -0.6, Magnitude: 4.0
Neutral          -> Score: 0.1, Magnitude: 0.0
Mixed            -> Score: 0.0, Magnitude: 4.0
These examples show the approximate mapping between score/magnitude values and the emotion of a text under Sentiment Analysis.
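The mapping above can be sketched as a small classifier. The cut-off values below are illustrative assumptions derived from the sample figures, not thresholds defined by the API:

```python
def classify_sentiment(score, magnitude):
    """Map a sentiment (score, magnitude) pair to a rough label.

    The cut-offs are illustrative assumptions based on the sample
    figures above, not thresholds defined by the API.
    """
    if score >= 0.5:
        return 'Clearly Positive'
    if score <= -0.5:
        return 'Clearly Negative'
    if magnitude >= 2.0:
        # Near-zero score but strong emotion overall: positive and
        # negative sentences cancel each other out.
        return 'Mixed'
    return 'Neutral'

print(classify_sentiment(0.8, 3.0))   # Clearly Positive
print(classify_sentiment(-0.6, 4.0))  # Clearly Negative
print(classify_sentiment(0.1, 0.0))   # Neutral
print(classify_sentiment(0.0, 4.0))   # Mixed
```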
Entity Analysis:
Entity Analysis provides information about entities in the text, which generally refer to named “things” such as famous individuals, landmarks, common objects, etc.
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()

input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

ent = client.analyze_entities(document=text)

for e in ent.entities:
    print(e.name, e.metadata, e.type, e.salience)
The above code extracts all entities from the text and prints each entity's name, type, salience (i.e. the prominence of the entity in the text), and metadata (present mostly for proper nouns, often including a Wikipedia link for the entity).
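Since salience measures prominence, a common post-processing step is ranking entities by it. The sketch below operates on plain (name, salience) pairs so that it is self-contained; in the real program the pairs would come from `ent.entities`, and the sample values are made up for illustration:

```python
def rank_entities(entities):
    """Sort (name, salience) pairs from most to least prominent."""
    return sorted(entities, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample values; real ones would come from ent.entities.
sample = [('London', 0.30), ('Sherlock Holmes', 0.55), ('violin', 0.15)]
for name, salience in rank_entities(sample):
    print('{}: {:.2f}'.format(name, salience))
```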
Syntax Analysis:
Syntax Analysis breaks up the given text into tokens (by default a series of words) and provides linguistic information about those tokens.
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()

input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

tokens = client.analyze_syntax(document=text).tokens

for token in tokens:
    speech_tag = enums.PartOfSpeech.Tag(token.part_of_speech.tag)
    print(u'{}: {}'.format(speech_tag.name, token.text.content))
The above code prints each token along with its part of speech — whether it is a noun, verb, pronoun, punctuation, etc. For further information, visit the Google Natural Language API documentation here.
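A common use of the token list is counting how often each part of speech occurs. The sketch below works on plain (tag_name, word) tuples so that it runs on its own; in the real program the tuples would be the values printed by the loop above, and the sample sentence is made up for illustration:

```python
from collections import Counter

def pos_frequencies(tagged_tokens):
    """Count occurrences of each part-of-speech tag.

    tagged_tokens: iterable of (tag_name, word) pairs, such as the
    values printed by the syntax-analysis loop above.
    """
    return Counter(tag for tag, _ in tagged_tokens)

# Hypothetical tagged sentence for illustration.
sample = [('DET', 'The'), ('NOUN', 'dog'), ('VERB', 'barks'),
          ('PUNCT', '.'), ('NOUN', 'cat')]
print(pos_frequencies(sample))  # NOUN appears twice
```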
Thus, the Google Cloud APIs provide powerful services that are easy to use, portable, and require only short, clear client code.
Note:
Sometimes, the above programs will result in the error "ImportError: cannot import name 'cygrpc'". The problem persists if we try to install the package directly using

pip install cygrpc
or
sudo apt-get install cygrpc
Instead, use the following command:
python -m pip install grpcio --ignore-installed