How to Remove tags using BeautifulSoup in Python?
Prerequisite: BeautifulSoup module
In this article, we write a Python script that removes a tag from the parse tree and then completely destroys it and its contents. This is done with the decompose() method, which is built into the module.
Syntax:
Tag.decompose()
Tag.decompose() removes a tag from the tree of a given HTML document, then completely destroys it and its contents. The method works in place and returns None.
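To make the in-place behaviour concrete, here is a minimal sketch. The markup and the variable names (html, result, remaining) are made up for this illustration and are not part of the original article.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration
html = '<div><p>Keep this paragraph.</p><span class="ad">Remove this.</span></div>'
soup = BeautifulSoup(html, 'html.parser')

# decompose() modifies the tree in place and returns None
result = soup.span.decompose()

# The <span> and its contents are gone from the tree
remaining = str(soup)
print(result)     # None
print(remaining)  # <div><p>Keep this paragraph.</p></div>
```

Because the return value is always None, the tree (here soup) is the thing to inspect after the call, not the result of decompose().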
Implementation:
Example 1:
Python3
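from bs4 import BeautifulSoup

# Sample markup, reconstructed from the output shown below
markup = '<a href="https://www.geeksforgeeks.org/">Welcome to <i>geeksforgeeks.com</i></a>'
soup = BeautifulSoup(markup, 'html.parser')
print("Before Decompose")
print(soup.a)
# decompose() removes the tag in place and returns None
new_tag = soup.a.decompose()
print("After decomposing:")
print(new_tag)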
Output:
Before Decompose
<a href="https://www.geeksforgeeks.org/">Welcome to <i>geeksforgeeks.com</i></a>
After decomposing:
None
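Example 1 removes only the first <a> tag, because soup.a returns the first match. To remove every occurrence of a tag, one possible approach is to loop over find_all() and decompose each result. The markup below is a hypothetical sample, not taken from the article.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two links
html = '<p>Visit <a href="#one">one</a> and <a href="#two">two</a> today.</p>'
soup = BeautifulSoup(html, 'html.parser')

# find_all() returns every matching tag; decompose each one in turn
for link in soup.find_all('a'):
    link.decompose()

cleaned = str(soup)
print(cleaned)
```

After the loop, soup.find_all('a') returns an empty list, and the link text is removed along with the tags.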
Example 2: Fetching an HTML document from a given URL and decomposing the whole tree.
Python3
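from bs4 import BeautifulSoup
import requests

# Assumption: the original does not state the URL, so this is a placeholder
url = "https://www.example.com/"
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')
print("Before Decomposing")
print(soup)
# Calling decompose() on the soup object destroys the entire tree
result = soup.decompose()
print("After decomposing:")
print(result)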
Output:
Before Decomposing
<!DOCTYPE html>
<!--[if IE 7]>
<html class="ie ie7" lang="en-US" prefix="og: http://ogp.me/ns#">
<![endif]-->
<!--[if IE 8]>
<html class="ie ie8" lang="en-US" prefix="og: http://ogp.me/ns#">
<![endif]-->
<!--[if !(IE 7) | !(IE 8) ]><!-->
<html lang="en-US" prefix="og: http://ogp.me/ns#">
<!--<![endif]-->
<head>
<meta charset="utf-8"/>..
……
After decomposing:
None
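Decomposing the whole soup, as in Example 2, discards the entire document. When cleaning a scraped page, it is often more useful to remove only selected tags and keep the rest. The sketch below assumes the goal is to strip <script> and <style> tags; the page string is a hypothetical stand-in for reqs.text so that the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for reqs.text from a real request
page = """<html><head><title>Demo</title><style>p {color: red;}</style></head>
<body><p>Hello</p><script>alert('hi');</script></body></html>"""
soup = BeautifulSoup(page, 'html.parser')

# Passing a list to find_all() matches any of the given tag names
for tag in soup.find_all(['script', 'style']):
    tag.decompose()

# The remaining text no longer contains script or style contents
text = soup.get_text(separator=' ', strip=True)
print(text)
```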
Last Updated: 07 Feb, 2023