
How To Use Cloud Speech-To-Text For Speech Recognition On GCP?

Google Cloud Platform (GCP) is one of the major cloud service providers. Alongside its deployment and storage services, GCP offers speech recognition through a service called Cloud Speech-to-Text. It lets developers convert spoken language into text with high accuracy, and it can be integrated into applications to provide transcriptions, which businesses can use to improve accessibility. In this article, we will look at Cloud Speech-to-Text and walk through how to use it to get the transcription of a speech file.

Key Terminologies

Steps To Use Cloud Speech-To-Text For Speech Recognition On GCP

Step 1: Open GCP Cloud Console

Step 2: Enable Cloud Speech-To-Text API





Step 3: Create A Service Account

Step 4: Create JSON Key

For Key Type, select JSON. This creates the key, and a JSON file is downloaded automatically.
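Before wiring the downloaded key into code, it can help to confirm that the file is a service-account key. The sketch below is only a sanity check using the standard library. The helper name `check_key_file` and the exact set of fields it looks for are assumptions made for illustration, not an official validation routine.

```python
import json
from pathlib import Path

# Fields a service-account key file is expected to contain (assumed minimal set).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found in the key file (empty list means it looks fine)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

An empty list suggests the file has the expected shape. It does not prove that the key is still valid on the GCP side.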

Step 5: Install Required Packages

pip install --upgrade google-cloud-speech
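To confirm that the installation worked, one option is to check whether Python can locate the package without importing it fully. This is a small sketch using only the standard library, and the function name `is_speech_client_installed` is made up for this example.

```python
import importlib.util


def is_speech_client_installed():
    """Return True if the google.cloud.speech package can be located."""
    try:
        return importlib.util.find_spec("google.cloud.speech") is not None
    except ImportError:
        # Raised when a parent package such as 'google' or 'google.cloud' is missing.
        return False


print("google-cloud-speech installed:", is_speech_client_installed())
```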



Step 6: Import Library

from google.cloud import speech


Step 7: Connect With GCP

client = speech.SpeechClient.from_service_account_file('[file_name].json')
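Hard-coding the key file name works for a quick test, but the path often differs between machines. A possible approach, sketched below, is to fall back to the `GOOGLE_APPLICATION_CREDENTIALS` environment variable when no explicit path is given. The helper `resolve_key_path` is a hypothetical name introduced for this example, not part of the client library.

```python
import os


def resolve_key_path(explicit_path=None):
    """Pick the key file path: explicit argument first, then the environment variable."""
    path = explicit_path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise ValueError("No key file given and GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Key file not found: {path}")
    return path
```

The returned path can then be passed to `speech.SpeechClient.from_service_account_file(...)` as in the step above.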


Step 8: Select Speech File
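The next step refers to a variable named `mp3_data`, which holds the raw bytes of the speech file. A minimal sketch of reading a local file into that variable is shown here. The helper `load_audio_bytes` and the file name `speech.mp3` are assumptions for illustration.

```python
from pathlib import Path


def load_audio_bytes(path):
    """Read the whole audio file into memory as bytes."""
    data = Path(path).read_bytes()
    if not data:
        raise ValueError(f"Audio file is empty: {path}")
    return data


# Example with a hypothetical file name:
# mp3_data = load_audio_bytes("speech.mp3")
```

Reading the whole file into memory suits short clips. Synchronous recognition is intended for short audio of roughly a minute or less, so longer recordings would need a different request type.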

Step 9: Perform Speech-to-Text Operation

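# Wrap the raw audio bytes (mp3_data holds the contents of the speech file)
audio_file = speech.RecognitionAudio(content=mp3_data)

# Describe the audio and the transcription options
config = speech.RecognitionConfig(
    sample_rate_hertz=44100,
    enable_automatic_punctuation=True,
    language_code='en-US'
)

# Send the synchronous recognition request
response = client.recognize(config=config, audio=audio_file)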



Step 10: Check Result

print(response)



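The response contains a list of results. Each result holds one or more alternatives, and each alternative carries a transcript along with details such as a confidence score.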

Here, we need only the transcription as output, so let’s format the print statement to get only the transcription.

for result in response.results:
    print("Transcript : {} ".format(result.alternatives[0].transcript))
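When the audio produces several results, it is often handier to join them into a single string than to print each one. The sketch below does that with a hypothetical helper, `collect_transcripts`. It only relies on the `results`, `alternatives`, and `transcript` attributes used above, so it is demonstrated here with a stand-in object rather than a real API response.

```python
from types import SimpleNamespace


def collect_transcripts(response):
    """Join the top alternative of every result into one string."""
    parts = []
    for result in response.results:
        if result.alternatives:  # skip results that carry no hypotheses
            parts.append(result.alternatives[0].transcript.strip())
    return " ".join(parts)


# Stand-in object that mimics the shape of a recognition response.
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript="Hello there.", confidence=0.9)]),
    SimpleNamespace(alternatives=[SimpleNamespace(transcript=" How are you?", confidence=0.8)]),
])
print(collect_transcripts(fake_response))  # Hello there. How are you?
```

With a real response from the recognition step, the same function can be called as `collect_transcripts(response)`.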



Conclusion

The Google Cloud Speech-to-Text API offers a reliable way to convert audio data into text with high accuracy. With it, developers can integrate speech recognition into their applications with only a few lines of code. It suits use cases such as transcription, voice-controlled interfaces, sentiment analysis, and more.

Speech Recognition In GCP – FAQs

How To Enable Cloud Speech-to-Text API In GCP?

To enable the Cloud Speech-to-Text API in Google Cloud Platform, go to “APIs & Services” in the Cloud Console, search for the Cloud Speech-to-Text API, click the search result, and then click Enable on the next page.

How Accurate Is Cloud Speech-to-Text?

Cloud Speech-to-Text generally achieves high accuracy, and Google continues to improve it. However, accuracy depends on several factors, including audio quality, background noise, accent, and language complexity.

How Many Languages Does Cloud Speech-to-Text Support?

Cloud Speech-to-Text supports multiple languages and variants, including regional accents and dialects. It currently supports over 125 languages. Users can list up to three languages for automatic language recognition.
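As a rough illustration of the language options, the sketch below builds the language part of a recognition config as a plain dictionary and enforces the three-language limit mentioned above. The helper name `build_language_config` is hypothetical. The field name `alternative_language_codes`, and the reading of the limit as three alternatives in addition to the primary language, are assumptions that should be checked against the installed client library version.

```python
MAX_ALTERNATIVE_LANGUAGES = 3  # limit as described in the answer above


def build_language_config(primary, alternatives=()):
    """Return the language-related config fields as a plain dict."""
    alternatives = list(alternatives)
    if len(alternatives) > MAX_ALTERNATIVE_LANGUAGES:
        raise ValueError(
            f"At most {MAX_ALTERNATIVE_LANGUAGES} alternative languages are allowed"
        )
    config = {"language_code": primary}
    if alternatives:
        # Assumed field name; verify it exists in your client library version.
        config["alternative_language_codes"] = alternatives
    return config


print(build_language_config("en-US", ["es-ES", "fr-FR"]))
```

The resulting keys could be merged with the other options, such as `sample_rate_hertz`, when constructing the recognition config.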

Is Cloud Speech-to-Text A Free Or Paid Service?

Cloud Speech-to-Text is a paid service provided by Google Cloud Platform. Pricing is based on the duration of the audio and other factors, with different rates depending on the number of seconds processed per month.

How To Use Cloud Speech-to-Text For Free?

There is no option to use Cloud Speech-to-Text for free. However, you can get a free trial of Google Cloud Platform, which offers $300 in credit to use on any Google Cloud service, including Cloud Speech-to-Text.

