BeautifulSoup – Remove the contents of tag

Last Updated : 25 Feb, 2021

In this article, we are going to see how to remove the contents of a tag in an HTML document using BeautifulSoup. BeautifulSoup is a Python library used for extracting data from HTML and XML files.

Modules needed:

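BeautifulSoup: Our primary module. It parses HTML and XML documents and provides methods to navigate, search, and modify the parsed tree. It does not fetch webpages over HTTP itself.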

To install it, run this command in your terminal:

pip install beautifulsoup4

Approach:

First, we will import the required library.
We will read the HTML file or text.
We will pass that text to BeautifulSoup to create the soup object.
We will then find the required tag and clear its contents.
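
The approach above can be sketched end to end as follows. This is a minimal sketch, not the only way to do it: the sample markup and the variable names `html` and `heading` are assumptions made for illustration, and `find()` is used so that a missing tag can be handled instead of raising an error.

```python
from bs4 import BeautifulSoup

# Sample markup (an assumption for illustration); any HTML string would do.
html = "<html><body><h1>This is a test page</h1><p>Keep me</p></body></html>"

# Pass the text to BeautifulSoup, using Python's built-in parser.
soup = BeautifulSoup(html, "html.parser")

# Find the required tag; find() returns None when nothing matches.
heading = soup.find("h1")
if heading is not None:
    # clear() removes the tag's children but keeps the tag itself.
    heading.clear()

print(soup)
```

The `<p>` tag is left untouched, since only the contents of the tag we looked up are removed.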

Step-by-step implementation:

Step 1: We will import the library and read or create the HTML document that we want to parse.

Python3

# Importing libraries 
from bs4 import BeautifulSoup 
  
# Reading the html text we want to parse 
text = "<html> <head><title> Welcome </title></head><body><h1>This is a test page</h1></body></html>"

Step 2: We will pass the text to the BeautifulSoup constructor and specify the parser; here we use Python's built-in html.parser. Other parsers that can be used are lxml (for HTML or XML) and html5lib, both of which must be installed separately. Then we will print the tag whose contents we want to remove.

Python3

# creating a soup 
soup = BeautifulSoup(text,"html.parser") 
  
# printing the content in h1 tag 
print(f"Content of h1 tag is: {soup.h1}")

Output:

Content of h1 tag is: <h1>This is a test page</h1>
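
Before clearing a tag, it can help to look at what it actually holds. The following is a small sketch that reuses the same sample text; the variable name `h1` is an assumption for illustration. It prints the tag's name, its list of direct children, and its text.

```python
from bs4 import BeautifulSoup

text = "<html> <head><title> Welcome </title></head><body><h1>This is a test page</h1></body></html>"
soup = BeautifulSoup(text, "html.parser")

h1 = soup.h1

# Name of the tag we are about to clear.
print(h1.name)

# Direct children of the tag; these are what clear() will remove.
print(h1.contents)

# Text inside the tag, without the surrounding markup.
print(h1.get_text())
```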

Step 3: We will call the .clear() method on the tag. It removes everything inside the tag (text and nested tags) while leaving the empty tag itself in the document.

Python3

# clearing the content of the tag 
soup.h1.clear() 
  
# printing the content in h1 tag after clearing 
print(f"Content of h1 tag after clearing: {soup.h1}")
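
To make the effect of clear() concrete, here is a short sketch on a tag that has nested children, contrasted with decompose(), which removes the tag itself. The markup is a made-up example, not part of the article's sample page.

```python
from bs4 import BeautifulSoup

# Made-up markup with a nested <b> tag inside the <h1>.
html = "<div><h1>Title <b>bold</b></h1><p>Paragraph</p></div>"
soup = BeautifulSoup(html, "html.parser")

# clear() removes the text and the nested <b>, but the empty <h1> stays.
soup.h1.clear()
print(soup)  # <div><h1></h1><p>Paragraph</p></div>

# decompose() removes the whole <p> tag, not just its contents.
soup.p.decompose()
print(soup)  # <div><h1></h1></div>
```

Use clear() when the tag should remain as an empty placeholder, and decompose() when the tag should disappear from the document entirely.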

Below is the full implementation:

Python3

# Importing libraries 
from bs4 import BeautifulSoup 
  
# Reading the html text we want to parse 
text = "<html> <head><title> Welcome </title></head><body><h1>This is a test page</h1></body></html>"
  
# creating a soup 
soup = BeautifulSoup(text,"html.parser") 
  
# printing the content in h1 tag 
print(f"Content of h1 tag is: {soup.h1}") 
  
# clearing the content of the tag 
soup.h1.clear() 
  
# printing the content in h1 tag after clearing 
print(f"Content of h1 tag after clearing: {soup.h1}")
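
The program above clears only the first h1 tag, because soup.h1 returns the first match. When every tag with a given name should be emptied, a small helper built on find_all() is one option. The function name `clear_tags` and the sample list markup are hypothetical, chosen only for this sketch.

```python
from bs4 import BeautifulSoup


def clear_tags(html, tag_name):
    """Return html with the contents of every <tag_name> tag removed.

    Hypothetical helper for illustration; it is not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns every matching tag, not just the first one.
    for tag in soup.find_all(tag_name):
        tag.clear()
    return str(soup)


result = clear_tags("<ul><li>One</li><li>Two</li></ul>", "li")
print(result)  # <ul><li></li><li></li></ul>
```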
