Translation and Natural Language Processing using Google Cloud

Prerequisite: Create a Virtual Machine and set up an API on Google Cloud

In this article, we will discuss how to use Google's Translation and Natural Language Processing features on Google Cloud. Before reading this article, you should know how to create an instance in a Virtual Machine and how to set up an API (refer to the prerequisite article above).

Translation API –

Google Cloud offers several tools and services for Translation and Natural Language Processing (NLP), including the following:

  1. Cloud Translation API: This API enables you to translate text between languages using pre-trained models. The API supports over 100 languages and can be used to integrate translation capabilities into your applications.
  2. Cloud Speech-to-Text API: This API converts spoken words into text using machine learning models. The API supports multiple languages and can be used to transcribe audio and video recordings.
  3. Cloud Text-to-Speech API: This API converts text into spoken words using machine learning models. The API supports multiple languages and can be used to generate audio for your applications.
  4. Cloud Natural Language API: This API provides pre-trained models for analyzing and understanding text, including sentiment analysis, entity recognition, and syntax analysis. The API supports several languages and can be used to extract insights from unstructured text data.
  5. AutoML Natural Language: This service enables you to train custom NLP models using your own data. You can train models for tasks such as sentiment analysis, entity recognition, and classification, without needing to have expertise in machine learning.

Using these tools and services, you can build powerful applications that can analyze and understand natural language text and speech. Whether you need to translate text, transcribe audio recordings, or analyze text data, Google Cloud offers a range of tools to help you get started.

Advantages of using Google Cloud for Translation and Natural Language Processing include:

  1. High accuracy: Google Cloud’s machine learning models have been trained on massive amounts of data, resulting in high accuracy and performance.
  2. Scalability: Google Cloud offers scalable and flexible solutions that can handle large volumes of data and traffic, making it suitable for businesses of all sizes.
  3. Ease of use: Google Cloud provides user-friendly APIs and services that can be easily integrated into your applications, without requiring extensive technical expertise.
  4. Customization: With AutoML Natural Language, you can train custom models to suit your specific needs, providing greater accuracy and customization.
  5. Security: Google Cloud’s services are secure and comply with industry-standard security protocols, ensuring the protection of your data.

Disadvantages of using Google Cloud for Translation and Natural Language Processing include:

  1. Cost: Some services can be expensive, especially if you have high volumes of data or traffic.
  2. Complexity: Some services may require technical expertise to set up and integrate with your applications.
  3. Dependency: Using cloud services means that you are dependent on the reliability and uptime of the service provider, and any downtime or outages could impact your applications.
  4. Data privacy: As with any cloud-based service, there may be concerns over data privacy and security, especially if you are working with sensitive data.
  • The Google Translate API works in the same way as described in the API setup article above.
  • First, enable the Cloud Translation API and download the .json file containing the credential information, as instructed in the setup article.

You need to install the following packages –

pip install google-cloud
pip install google-cloud-translate
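Before running the samples below, it can help to confirm that the client library is actually importable in your environment. The following check uses only the standard library; the module names probed are illustrative:

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named module can be imported in this environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # A missing parent package (e.g. 'google') also means not installed
        return False


print(is_installed("os"))                  # stdlib module, always True
print(is_installed("google.cloud.translate"))  # True only after pip install
```

If the second check prints False, re-run the pip commands above before continuing.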

Save the credentials.json file in the same folder as the .py file containing the Python code. We need to store the path of credentials.json (C:\Users\…) in the environment variable 'GOOGLE_APPLICATION_CREDENTIALS', which is done near the top of the following code.

Python3
import os
import io
from google.cloud import translate

# Point the client library at the service-account credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

input_file = "filename_input.txt"
output_file = "filename_output.txt"

# The encoding needs to be utf-8 (Unicode) because Google Cloud
# supports all languages, while ASCII covers only English.
with io.open(input_file, "r", encoding="utf-8") as inp:
    data = inp.read()

translate_client = translate.Client()
translated = [translate_client.translate(
    data, target_language='en')['translatedText']]

# Use a context manager so the output file is closed properly
with io.open(output_file, mode='w', encoding='utf-8',
             errors='ignore') as out:
    out.write('\n'.join(translated))


The input and output .txt files should be in the same folder as the Python file; otherwise, the full path must be provided. The input file should contain the source text in any language. The user does not need to specify the source language, as Google detects it automatically. The target language, however, needs to be provided as an ISO 639-1 code (for example, the above code translates the text into English, coded 'en').
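For quick reference, a few common ISO 639-1 codes can be kept in a small lookup table. The subset below is illustrative only; the API supports over 100 languages:

```python
# Illustrative subset of ISO 639-1 language codes (not exhaustive)
ISO_639_1 = {
    "en": "English",
    "hi": "Hindi",
    "fr": "French",
    "de": "German",
    "es": "Spanish",
    "ja": "Japanese",
}


def language_name(code):
    """Look up the language name for a two-letter target code."""
    return ISO_639_1.get(code.lower(), "unknown code")


print(language_name("en"))  # English
print(language_name("xx"))  # unknown code
```

Any code from the full ISO 639-1 list that the API supports can be passed as target_language in the translation code above.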

Natural Language Processing –

Enable the Cloud Natural Language API and download the ‘credentials.json’ file as explained above. You need to install the following package –

pip install google-cloud-language

Google’s Natural Language Processing API provides several methods for analyzing text, each a valuable aspect of language analysis. Sentiment Analysis: It analyzes the text and determines its emotional opinion. The output of Sentiment Analysis is a score in the range -1 to 1, where -1 signifies fully negative emotion, 1 signifies fully positive emotion and 0 signifies neutral. It also outputs a magnitude, ranging from 0 to infinity, indicating the overall strength of emotion.

Python3
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

# Point the client library at the service-account credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r", encoding="utf-8") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

annotation = client.analyze_sentiment(document=text)

score = annotation.document_sentiment.score
magnitude = annotation.document_sentiment.magnitude

# Print the sentiment of each sentence individually
for index, sentence in enumerate(annotation.sentences):
    sentence_sentiment = sentence.sentiment.score
    print('Sentence #{} Sentiment score: {}'.format(
        index + 1, sentence_sentiment))

print('Score: {}, Magnitude: {}'.format(score, magnitude))


The text should be present in the file titled filename_input.txt. The above code will analyze and print the sentiment of the text sentence by sentence and will also provide the overall sentiment.

Clearly Positive -> Score: 0.8, Magnitude: 3.0
Clearly Negative -> Score: -0.6, Magnitude: 4.0
Neutral -> Score: 0.1, Magnitude: 0.0
Mixed -> Score: 0.0, Magnitude: 4.0
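The approximate ranges above can be sketched as a rough rule-of-thumb classifier. The 0.25 score cutoff and 2.0 magnitude cutoff below are illustrative assumptions, not thresholds documented by the API:

```python
def classify_sentiment(score, magnitude):
    """Rough interpretation of a document-level score/magnitude pair."""
    if score >= 0.25:
        return "Clearly Positive"
    if score <= -0.25:
        return "Clearly Negative"
    # Near-zero score: low magnitude means little emotion either way,
    # high magnitude means strong but cancelling emotions (Mixed)
    if magnitude < 2.0:
        return "Neutral"
    return "Mixed"


print(classify_sentiment(0.8, 3.0))   # Clearly Positive
print(classify_sentiment(-0.6, 4.0))  # Clearly Negative
print(classify_sentiment(0.1, 0.0))   # Neutral
print(classify_sentiment(0.0, 4.0))   # Mixed
```

In practice, the right cutoffs depend on your domain, so calibrate them against a sample of your own documents.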

This is the approximate nature of emotions attached to the texts via Sentiment Analysis.

Entity Analysis: Entity Analysis provides information about entities in the text, which generally refer to named “things” such as famous individuals, landmarks, common objects, etc.

Python3
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

# Point the client library at the service-account credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r", encoding="utf-8") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

ent = client.analyze_entities(document=text)

entity = ent.entities

# Print each entity's name, metadata, type and salience
for e in entity:
    print(e.name, e.metadata, e.type, e.salience)


The above code will extract all entities from the text and print each entity's name, type, salience (i.e. the prominence of the entity) and metadata (present mostly for proper nouns, often including the Wikipedia link for that entity).

Syntax Analysis: Syntax Analysis breaks up the given text into tokens (by default, a series of words) and provides linguistic information about those tokens.

Python3
import os
import io
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

# Point the client library at the service-account credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = language.LanguageServiceClient()
input_file = "filename_input.txt"
with io.open(input_file, "r", encoding="utf-8") as inp:
    docu = inp.read()

text = types.Document(content=docu,
                      type=enums.Document.Type.PLAIN_TEXT)

tokens = client.analyze_syntax(document=text).tokens

# Print each token alongside its part-of-speech tag
for token in tokens:
    speech_tag = enums.PartOfSpeech.Tag(token.part_of_speech.tag)
    print(u'{}: {}'.format(speech_tag.name, token.text.content))


The above code provides a list of all words along with their syntax: whether each is a noun, verb, pronoun, punctuation, etc. For further information, visit the Google Natural Language API documentation. Thus, Google Cloud APIs provide high-functionality services that are easy to use, portable, short and clear.

Note: Sometimes, the above programs will result in the error “ImportError: cannot import name ‘cygrpc'”. The problem arises when we try to install it using

pip install cygrpc
or
sudo apt-get install cygrpc

Instead, use the following command:

python -m pip install grpcio --ignore-installed


Last Updated : 21 Apr, 2023