How to remove empty tags using BeautifulSoup in Python?


Prerequisites: Requests, BeautifulSoup, strip

The task is to write a program that removes empty tags from HTML code. Beautiful Soup has no built-in method for removing tags that have no content, so the tags must be checked and removed one by one.

Modules Needed:

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library. To install it, run the command below in the terminal.
pip install bs4
  • requests: Requests lets you send HTTP/1.1 requests with very little code. It is also not part of the standard library. To install it, run the command below in the terminal.
pip install requests
  • lxml: The examples pass "lxml" as the parser to Beautiful Soup, so this parser must be installed as well. To install it, run the command below in the terminal.
pip install lxml

Approach:

  • Get the HTML code.
  • Iterate through each tag.
    • Fetch the text of the tag and strip the surrounding whitespace.
    • If the length of the stripped text is zero, remove the tag from the HTML code.
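
The steps above can be sketched as a small reusable function. This is a minimal sketch, not a definitive implementation: the function name remove_empty_tags and its parser argument are hypothetical, and it defaults to Python's built-in "html.parser" so that it runs without installing lxml.

```python
from bs4 import BeautifulSoup


def remove_empty_tags(html: str, parser: str = "html.parser") -> str:
    """Return html with every tag that holds no text removed.

    The name and the parser default are illustrative assumptions.
    """
    # Step 1: get the HTML code as a parse tree
    soup = BeautifulSoup(html, parser)

    # Step 2: find_all() with no arguments returns every tag
    for tag in soup.find_all():
        # Step 3: fetch the text with whitespace stripped
        text = tag.get_text(strip=True)

        # Step 4: a zero-length result means the tag is empty
        if len(text) == 0:
            tag.extract()

    return str(soup)


cleaned = remove_empty_tags("<div><p>  </p><span>kept</span><p></p></div>")
print(cleaned)
```

Here both p tags are dropped because they contain only whitespace or nothing at all, while the div survives because its span still holds text.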

Example 1: Remove empty tags from an HTML string.

Python3




# Import module
from bs4 import BeautifulSoup

# HTML to clean
html_object = """
  
<p>
<p></p>
<strong>some<br>text<br>here</strong></p>
  
"""

# Parse the HTML code
soup = BeautifulSoup(html_object, "lxml")

# Iterate over every tag in the document
for x in soup.find_all():

    # Fetch the tag's text with surrounding whitespace stripped
    if len(x.get_text(strip=True)) == 0:

        # Remove the empty tag; <br> holds no text, so it is removed too
        x.extract()

# Print the HTML code with empty tags removed
print(soup)

Output:

<html><body><strong>sometexthere</strong>
</body></html>
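
Note that the output above no longer contains the <br> tags: they hold no text, so the loop treats them as empty. If that is not wanted, one possible variant skips tags that are meaningful without text. The sketch below is an assumption-laden illustration, not part of the original method: the KEEP list and the function name remove_empty_tags_keep_void are hypothetical, and it uses the built-in "html.parser" instead of lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical allow-list of tags that carry meaning without any text
KEEP = ["br", "hr", "img", "input"]


def remove_empty_tags_keep_void(html: str) -> str:
    """Remove text-less tags, but keep tags in KEEP and their ancestors."""
    soup = BeautifulSoup(html, "html.parser")

    for tag in soup.find_all():
        # Skip the tag if it is allow-listed or contains an allow-listed tag
        if tag.name in KEEP or tag.find(KEEP) is not None:
            continue

        # Otherwise apply the same emptiness check as before
        if len(tag.get_text(strip=True)) == 0:
            tag.extract()

    return str(soup)


result = remove_empty_tags_keep_void(
    "<p></p><strong>some<br>text<br>here</strong>"
)
print(result)
```

With this variant the empty p tag is still removed, while the line breaks inside the strong tag are preserved (Beautiful Soup serializes them as <br/>).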

Example 2: Remove empty tags from the page at a given URL.

Python3




# Import modules
from bs4 import BeautifulSoup
import requests

# Page URL (placeholder: replace it with the page you want to clean)
URL = "https://example.com/"

# Fetch the page content from the URL
page = requests.get(URL)

# Parse the HTML code
soup = BeautifulSoup(page.content, "lxml")

# Iterate over every tag in the document
for x in soup.find_all():

    # Fetch the tag's text with surrounding whitespace stripped
    if len(x.get_text(strip=True)) == 0:

        # Remove the empty tag
        x.extract()

# Print the HTML code with empty tags removed
print(soup)

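Output:

The script prints the HTML of the requested page with its empty tags removed; the exact result depends on the page fetched.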

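
When fetching a live page, it can help to separate the download from the cleaning step and to fail early on HTTP errors. The following is a hedged sketch under stated assumptions: the function names clean_html and clean_page and the 10-second timeout are hypothetical choices, and the built-in "html.parser" is used instead of lxml.

```python
import requests
from bs4 import BeautifulSoup


def clean_html(markup) -> str:
    """Remove tags without text from markup (str or bytes)."""
    soup = BeautifulSoup(markup, "html.parser")
    for tag in soup.find_all():
        # An empty string after stripping means the tag has no text
        if not tag.get_text(strip=True):
            tag.extract()
    return str(soup)


def clean_page(url: str, timeout: float = 10.0) -> str:
    """Download url and return its HTML with empty tags removed."""
    # The timeout keeps the call from hanging on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return clean_html(response.content)
```

Calling print(clean_page(URL)) with a URL of your choice would then print the cleaned page, and clean_html can be tried on its own without any network access.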
Last Updated : 26 Nov, 2020