Translation and Natural Language Processing using Google Cloud

Prerequisite: Create a Virtual Machine and setup API on Google Cloud

In this article, we will discuss how to use Google's Translation and Natural Language Processing features using Google Cloud. Before reading this article, you should know how to create a Virtual Machine instance and how to set up an API on Google Cloud (refer to the prerequisite above).

Translation API –

  • The Google Translation API works in the same way as described here.
  • First enable the Cloud Translation API and download the .json file containing the credential information as instructed here.

You need to install the following packages –

pip install google-cloud
pip install google-cloud-translate

Save the credentials.json file in the same folder as the .py file containing the Python code. The path of credentials.json (C:\Users\…) needs to be stored in the environment variable 'GOOGLE_APPLICATION_CREDENTIALS', which is done at the top of the following code.

import os
import io
from google.cloud import translate

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

input_file = "filename_input.txt"
output_file = "filename_output.txt"

# The encoding needs to be utf-8 (Unicode) because
# Google Cloud supports all languages, while ASCII
# supports only English.
with io.open(input_file, "r", encoding="utf-8") as inp:
    data = inp.read()

translate_client = translate.Client()
translated = [translate_client.translate(
    data, target_language='en')['translatedText']]

with io.open(output_file, "w", encoding="utf-8") as out:
    out.write('\n'.join(translated))



The input and output .txt files should be in the same folder as the Python file; otherwise, the full path must be provided. The input file should contain the source text in any language. The user does not need to specify the language, as Google detects it automatically. The target language, however, must be provided as an ISO 639-1 code (for example, the above code translates the text into English, coded 'en').
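The translate() call actually returns a dict with more than the translated string, including the auto-detected source language mentioned above. A minimal sketch of reading those fields — the sample values below are illustrative, not real API output, though the keys ('translatedText', 'detectedSourceLanguage', 'input') match the v2 Translation API response:

```python
# Illustrative response dict in the shape returned by translate();
# the values are made up, the keys are from the v2 API.
sample_response = {
    'translatedText': 'Hello, world',
    'detectedSourceLanguage': 'fr',
    'input': 'Bonjour, le monde',
}

def summarize(response):
    """Return a short human-readable summary of a translate() response."""
    return '{} -> {} (detected source: {})'.format(
        response['input'],
        response['translatedText'],
        response['detectedSourceLanguage'])

print(summarize(sample_response))
```

In a real run you would pass the dict returned by translate_client.translate(...) instead of sample_response.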

Natural Language Processing –

Enable the Cloud Natural Language API and download the ‘credentials.json’ file as explained here. You need to download the following package –

pip install google-cloud-language

Google's Natural Language Processing API provides several methods for analyzing text, each covering a different aspect of language analysis.

Sentiment Analysis:
It analyzes the text and estimates its emotional opinion. The output of Sentiment Analysis is a score in the range -1 to 1, where -1 signifies fully negative emotion, 1 signifies fully positive emotion and 0 signifies neutral. It also outputs a magnitude, ranging from 0 to infinity, indicating the overall strength of emotion.


import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

annotation = client.analyze_sentiment(document=text)

score = annotation.document_sentiment.score
magnitude = annotation.document_sentiment.magnitude

# Per-sentence sentiment scores
for index, sentence in enumerate(annotation.sentences):
    sentence_sentiment = sentence.sentiment.score
    print('Sentence #{} Sentiment score: {}'.format(
        index + 1, sentence_sentiment))

# Document-level sentiment
print('Score: {}, Magnitude: {}'.format(score, magnitude))



The text should be present in the file titled filename_input.txt. The above code analyzes the sentiment of the text sentence by sentence and also prints the overall sentiment of the document.

Clearly Positive -> Score: 0.8, Magnitude: 3.0
Clearly Negative -> Score: -0.6, Magnitude: 4.0
Neutral -> Score: 0.1, Magnitude: 0.0
Mixed -> Score: 0.0, Magnitude: 4.0

These examples show the approximate mapping from (score, magnitude) pairs to the emotion attached to the text via Sentiment Analysis.
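The mapping above can be turned into a small helper that buckets a document-level (score, magnitude) pair. This is a sketch only: the thresholds (neutral_band, mixed_magnitude) are illustrative choices consistent with the sample values above, not cutoffs defined by Google:

```python
def describe_sentiment(score, magnitude,
                       neutral_band=0.25, mixed_magnitude=2.0):
    """Roughly bucket a document-level (score, magnitude) pair.

    Thresholds are illustrative; Google does not define official cutoffs.
    """
    if score >= neutral_band:
        return 'clearly positive'
    if score <= -neutral_band:
        return 'clearly negative'
    # Near-zero score with high magnitude usually means strong but
    # opposing emotions that cancel out.
    if magnitude >= mixed_magnitude:
        return 'mixed'
    return 'neutral'

print(describe_sentiment(0.8, 3.0))   # clearly positive
print(describe_sentiment(-0.6, 4.0))  # clearly negative
print(describe_sentiment(0.1, 0.0))   # neutral
print(describe_sentiment(0.0, 4.0))   # mixed
```

In practice you would call it with annotation.document_sentiment.score and annotation.document_sentiment.magnitude from the code above.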
 

Entity Analysis:
Entity Analysis provides information about entities in the text, which generally refer to named “things” such as famous individuals, landmarks, common objects, etc.


import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

ent = client.analyze_entities(document=text)

for e in ent.entities:
    # Convert the numeric type to its readable name, e.g. PERSON
    print(e.name, enums.Entity.Type(e.type).name, e.salience, e.metadata)



The above code extracts all entities from the text and prints each entity's name, type, salience (i.e. the prominence of the entity within the text) and metadata (present mostly for proper nouns, often including the Wikipedia link for that entity).
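Since salience measures how central each entity is to the document, one common next step is ranking entities by it. A minimal sketch, using made-up (name, type, salience) tuples rather than real API output:

```python
# Illustrative entity results (name, type_name, salience);
# these values are made up, not real analyze_entities output.
entities = [
    ('Paris', 'LOCATION', 0.62),
    ('Eiffel Tower', 'LOCATION', 0.28),
    ('Gustave Eiffel', 'PERSON', 0.10),
]

# Higher salience means the entity is more prominent in the text.
for name, type_name, salience in sorted(entities, key=lambda e: -e[2]):
    print('{:15s} {:10s} {:.2f}'.format(name, type_name, salience))
```

With the real API you would build the tuples from e.name, enums.Entity.Type(e.type).name and e.salience.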
 
Syntax Analysis:
Syntax Analysis breaks the given text into tokens (by default, a series of words) and provides linguistic information about those tokens.


import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

tokens = client.analyze_syntax(document=text).tokens

for token in tokens:
    # Convert the numeric part-of-speech tag to its readable name
    speech_tag = enums.PartOfSpeech.Tag(token.part_of_speech.tag)
    print(u'{}: {}'.format(speech_tag.name, token.text.content))



The above code prints each word along with its part of speech: noun, verb, pronoun, punctuation, etc. For further information, visit the Google Natural Language API documentation here.
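The per-token output lends itself to simple aggregation, such as counting how often each part of speech occurs. A minimal sketch using made-up (tag, word) pairs in the shape the loop above prints, not real analyze_syntax output:

```python
from collections import Counter

# Illustrative (part-of-speech, word) pairs; made-up values.
tagged_tokens = [
    ('NOUN', 'Google'), ('VERB', 'provides'), ('ADJ', 'useful'),
    ('NOUN', 'services'), ('PUNCT', '.'),
]

# Count how often each part-of-speech tag occurs.
tag_counts = Counter(tag for tag, _ in tagged_tokens)
for tag, count in tag_counts.most_common():
    print('{}: {}'.format(tag, count))
```

With the real API you would build the pairs from speech_tag.name and token.text.content inside the loop above.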

Thus, Google Cloud APIs provide highly functional services that are easy to use, portable and concise.

Note:
Sometimes, the above programs fail with the error "ImportError: cannot import name 'cygrpc'". The problem is not fixed by installing cygrpc directly using

pip install cygrpc
or
sudo apt-get install cygrpc

Instead, use the following command:

python -m pip install grpcio --ignore-installed

