
NLP – Expand contractions in Text Processing

Last Updated : 21 Feb, 2022

Text preprocessing is a crucial step in NLP. It means cleaning raw text and converting it into a consistent form that can be analyzed for the task at hand. In this article, we discuss contractions and how to handle them in text.

What are contractions?

Contractions are words or combinations of words that are shortened by dropping letters and replacing them with an apostrophe.

Nowadays, with so much communication shifting online, we talk to others mostly through text messages or posts on social media such as Facebook, Instagram, WhatsApp, Twitter, and LinkedIn. With so many people to talk to, we rely on abbreviations and shortened forms of words when texting.

For example: I’ll be there within 5 min. Are u not gng there? Am I mssng out on smthng? I’d like to see u near d park.

In informal text, we also often drop the vowels from a word to shorten it (gng, mssng, smthng). Strictly speaking, these are abbreviations rather than contractions, since no apostrophe marks the missing letters. Expanding contractions contributes to text standardization and is useful when working with Twitter data or product reviews, where individual words play an important role in sentiment analysis.
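The sample message above uses typographic apostrophes (’), and a lookup keyed on the plain ASCII apostrophe (') would miss them. As a rough sketch, assuming a simple regular expression is enough to spot candidates, the snippet below normalizes apostrophes and lists contraction-like tokens. The names normalize_apostrophes and find_contraction_candidates are hypothetical helpers, not part of any library, and possessives such as "John's" would also match this pattern.

```python
import re

# Map typographic apostrophes to the plain ASCII apostrophe.
APOSTROPHE_VARIANTS = str.maketrans({"\u2019": "'", "\u2018": "'"})

# Illustrative pattern: letters, one apostrophe, then more letters (e.g. "I'll").
CONTRACTION_LIKE = re.compile(r"\b[A-Za-z]+'[A-Za-z]+\b")


def normalize_apostrophes(text):
    """Replace curly apostrophes with straight ones."""
    return text.translate(APOSTROPHE_VARIANTS)


def find_contraction_candidates(text):
    """Return tokens that look like contractions after normalization."""
    return CONTRACTION_LIKE.findall(normalize_apostrophes(text))


sample = "I\u2019ll be there within 5 min. I\u2019d like to see u near d park."
candidates = find_contraction_candidates(sample)
print(candidates)
```

Abbreviations such as "u" or "d" are not picked up here, since they carry no apostrophe. They need a separate slang or abbreviation mapping.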

How to expand contractions?

1. Using contractions library

First, install the library. You can try it on Google Colab, where installation is straightforward.

Using pip (the leading ! runs a shell command from a notebook cell; omit it in a terminal):

!pip install contractions

In a Jupyter notebook, to install into the environment of the running kernel:

import sys  
!{sys.executable} -m pip install contractions

Code 1: Expanding contractions word by word using the contractions library

# import library
import contractions
# contracted text
text = '''I'll be there within 5 min. Shouldn't you be there too?
          I'd love to see u there my dear. It's awesome to meet new friends.
          We've been waiting for this day for so long.'''
 
# creating an empty list
expanded_words = []   
for word in text.split():
  # using contractions.fix to expand the shortened words
  expanded_words.append(contractions.fix(word))  
   
expanded_text = ' '.join(expanded_words)
print('Original text: ' + text)
print('Expanded_text: ' + expanded_text)


Output:

Original text: I'll be there within 5 min. Shouldn't you be there too? 
          I'd love to see u there my dear. It's awesome to meet new friends.
          We've been waiting for this day for so long.
Expanded_text: I will be there within 5 min. should not you be there too? I would love to see you there my dear. it is awesome to meet new friends. we have been waiting for this day for so long.

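Expanding contractions before forming word vectors helps with dimensionality reduction: variants such as "don't" and "do not" map to the same tokens, so the vocabulary (and with it the number of vector dimensions) gets smaller.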
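To illustrate the point with a toy corpus (an assumption made for illustration; the expanded sentences are written by hand to stand in for the output of an expansion step), the sketch below counts distinct tokens before and after expansion. In a bag-of-words representation, each distinct token is one dimension.

```python
# Toy corpus: the same statements written with and without contractions.
raw_docs = [
    "it's not good",
    "it is not good",
    "i don't like it",
    "i do not like it",
]

# Hand-written stand-in for the output of a contraction-expansion step.
expanded_docs = [
    "it is not good",
    "it is not good",
    "i do not like it",
    "i do not like it",
]


def vocabulary(docs):
    """Return the set of whitespace-separated tokens across all documents."""
    return {token for doc in docs for token in doc.split()}


raw_vocab = vocabulary(raw_docs)
expanded_vocab = vocabulary(expanded_docs)

# Tokens that exist only because of contractions.
print(sorted(raw_vocab - expanded_vocab))
print(len(raw_vocab), len(expanded_vocab))
```

Here the vocabulary shrinks from 9 tokens to 7, because "it's" and "don't" no longer need their own entries.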

Code 2: Simply using contractions.fix to expand the text.

text = '''She'd like to know how I'd done that!
          She's going to the park and I don't think I'll be home for dinner.
          Theyre going to the zoo and she'll be home for dinner.'''
 
contractions.fix(text)


Output:

'she would like to know how I would done that! 
 she is going to the park and I do not think I will be home for dinner.
 they are going to the zoo and she will be home for dinner.'

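Note that in the output above, "I'd done" is expanded to "I would done", although "I had done" was meant. Expansion that does not look at the surrounding context cannot tell such ambiguous contractions apart.

Contractions can also be handled with other techniques, such as dictionary mapping, or with the pycontractions library. Refer to the pycontractions documentation to learn more: https://pypi.org/project/pycontractions/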
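A minimal sketch of the dictionary-mapping approach is shown below, using only the standard library. The mapping CONTRACTION_MAP and the function expand_contractions are illustrative names chosen here, and the mapping is deliberately tiny. A real one would need many more entries.

```python
import re

# Small illustrative mapping (an assumption, far from a complete list).
CONTRACTION_MAP = {
    "i'll": "I will",
    "i'd": "I would",
    "don't": "do not",
    "shouldn't": "should not",
    "it's": "it is",
    "we've": "we have",
    "she'll": "she will",
}

# One pattern that matches any key of the mapping as a whole word.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTION_MAP) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text):
    """Replace every known contraction in text with its expanded form."""

    def replace(match):
        original = match.group(0)
        expanded = CONTRACTION_MAP[original.lower()]
        # Keep a leading capital letter, e.g. "Shouldn't" -> "Should not".
        if original[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    return PATTERN.sub(replace, text)


result = expand_contractions(
    "I'll be there. Shouldn't you be there too? It's late and we don't mind."
)
print(result)
```

Like any fixed mapping, this sketch always picks one expansion per contraction, so "I'd" becomes "I would" even where "I had" is meant.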


