BeautifulSoup – Remove the contents of a tag
In this article, we are going to see how to remove the contents of a tag in an HTML document using BeautifulSoup. BeautifulSoup is a Python library used for extracting data from HTML and XML files.
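BeautifulSoup: Our primary module. It parses an HTML or XML document into a tree that can be searched and modified. It does not fetch webpages over HTTP itself, so the markup has to come from a file, a string, or a separate HTTP library.
For installation, run this command in your terminal: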
pip install bs4
- First, we will import the required libraries.
- We will read the html file or text.
- We will pass that text to the BeautifulSoup constructor to create the soup object.
- We will then find the required tag and clear its contents.
Step 1: We will initialize the program, import the libraries, and read or create the HTML document that we want to parse.
Step 2: We will pass the retrieved text to the soup object and set the parser. In this case we are using Python's built-in html.parser. Other parsers that can be used are lxml (for HTML or XML) and html5lib, both of which must be installed separately. Then we will select the tag whose contents we want to remove.
Step 3: We will use the .clear() method. It removes everything inside the selected tag (text and child tags) while leaving the tag itself, with its attributes, in the document.
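Steps 1 and 2 can be sketched as follows. The sample markup and the choice of a div with id "main" as the target are assumptions made only for illustration; any tag that find() can locate would work the same way.

```python
from bs4 import BeautifulSoup

# Step 1: hypothetical sample markup, used only for illustration
html_doc = '<div id="main"><p>First</p><p>Second</p></div>'

# Step 2: build the soup with Python's built-in HTML parser
soup = BeautifulSoup(html_doc, "html.parser")

# Locate the tag whose contents will be removed later
target = soup.find("div", id="main")

print(target.name)                 # name of the tag that was found
print(len(target.find_all("p")))   # number of <p> children it currently holds
```

Note that find() returns None when nothing matches, so real code should check the result before using it.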
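A minimal sketch of what clear() does, again assuming a small sample document and a div with id "main" as the target (both are illustrative). The tag stays in the tree with its attributes; only its children are removed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
html_doc = '<div id="main"><p>First</p><p>Second</p></div>'
soup = BeautifulSoup(html_doc, "html.parser")
target = soup.find("div", id="main")

# Step 3: remove everything inside the tag, keeping the tag itself
target.clear()

print(soup)  # <div id="main"></div>
```

If the tag itself should also disappear from the document, decompose() is the method to reach for instead of clear().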
Below is the full implementation:
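The following is a minimal sketch that puts the three steps together, not a definitive implementation. The helper name clear_tag_contents, the sample markup, and the div with id "content" are all assumptions introduced for illustration.

```python
from bs4 import BeautifulSoup


def clear_tag_contents(markup, name, **attrs):
    """Return the markup with the contents of every matching tag removed.

    Hypothetical helper: `name` is the tag name, `attrs` are attribute
    filters such as id="content".
    """
    soup = BeautifulSoup(markup, "html.parser")
    for tag in soup.find_all(name, attrs=attrs):
        tag.clear()  # empties the tag but leaves it in the tree
    return str(soup)


# Hypothetical sample document
html_doc = """<html><body>
<h1>Title</h1>
<div id="content"><p>Some text</p><a href="#">link</a></div>
</body></html>"""

result = clear_tag_contents(html_doc, "div", id="content")
print(result)
```

The printed document still contains the heading and an empty div with id "content"; the paragraph and link that were inside the div are gone.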