How to extract a div tag and its contents by id with BeautifulSoup?

Beautiful Soup is a Python library for parsing HTML and XML, and it is commonly used for web scraping. It can also be used to modify HTML documents. This article shows how to use Beautiful Soup to extract a div and its contents by ID, using the find() method to locate the div.

Approach:

Import module
Scrape data from a webpage
Parse the scraped string as HTML
Find the div with its ID
Print its content

Syntax: find(tag_name, **kwargs)

Parameters:

The tag_name argument tells Beautiful Soup to find only tags with the given name. Text strings are ignored, as are tags whose names don't match.

The **kwargs arguments filter on tag attributes. Passing id="..." matches against each tag's id attribute, and find() returns the first matching tag.
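As a rough illustration of the parameters described above, the sketch below shows a few ways of filtering on the id attribute that should select the same element. The markup and the ids "main" and "note" are made up for this example, and select_one() is a CSS-selector alternative to find() rather than something this article otherwise uses.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = '<div id="main">Hello</div><p id="note">Note</p>'
soup = BeautifulSoup(html, "html.parser")

# Keyword filter on the id attribute (the form used in this article).
by_kwarg = soup.find("div", id="main")

# The same filter written as an attrs dictionary.
by_attrs = soup.find("div", attrs={"id": "main"})

# Omitting the tag name matches any tag with the given id.
any_tag = soup.find(id="note")

# CSS selector form: a div whose id is "main".
by_css = soup.select_one("div#main")

print(by_kwarg == by_attrs == by_css)
print(any_tag.name)
```

The keyword form is the shortest when filtering on a single attribute, while the attrs dictionary is useful for attribute names that are not valid Python identifiers.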

Below is the implementation:

Example 1:

Python3

# Import the module
from bs4 import BeautifulSoup

markup = '''<html><body><div id="container">Div Content</div></body></html>'''

# Parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')

# Find the div by its id
div_bs4 = soup.find('div', id="container")

# Print the text inside the div
print(div_bs4.string)

Output:

Div Content
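The example above prints only the text inside the div. If the goal is the div tag together with its contents, a minimal sketch along the following lines may help. The markup is a variation of Example 1 with a nested <b> tag added for illustration, and the variable names are arbitrary.

```python
from bs4 import BeautifulSoup

# Variation of Example 1: the div now contains a nested tag (illustrative only).
markup = '<html><body><div id="container">Div <b>Content</b></div></body></html>'
soup = BeautifulSoup(markup, "html.parser")

div = soup.find("div", id="container")

# The tag itself plus everything inside it, as an HTML string.
outer_html = str(div)

# Only the markup inside the tag, without the enclosing <div>.
inner_html = div.decode_contents()

# The text of all descendants, with tags stripped.
text_only = div.get_text()

print(outer_html)
print(inner_html)
print(text_only)
```

Note that div.string would be None for this markup, because the div has more than one child; get_text() is the safer choice when a div may contain nested tags.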

Example 2:

Python3

# Import the module
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html>
<html>
  <head><title>Example</title></head>
  <body>
    <p>
      Nested div
    </p>
    <div id="first"> Div with ID first
      <div id="second"> Div with id second
      </div>
    </div>
  </body>
</html>
"""

# Parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')

# Find the div by its id
div_bs4 = soup.find('div', id="second")

# Print the text inside the div
print(div_bs4.string)

Output:

 Div with id second
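Two edge cases are worth keeping in mind with nested divs like these: .string is None for a tag that has more than one child, and find() returns None when no tag matches. The sketch below illustrates both with a condensed version of the nested markup; the id "third" is a hypothetical id that does not exist in it.

```python
from bs4 import BeautifulSoup

# Condensed version of the nested divs from Example 2.
markup = (
    '<div id="first"> Div with ID first '
    '<div id="second"> Div with id second </div></div>'
)
soup = BeautifulSoup(markup, "html.parser")

outer = soup.find("div", id="first")

# The outer div has two children (a text node and a div), so .string is None.
print(outer.string)

# get_text() collects the text of all descendants instead.
outer_text = outer.get_text(" ", strip=True)
print(outer_text)

# find() returns None when nothing matches, so check before using the result.
missing = soup.find("div", id="third")  # hypothetical id, not in the markup
if missing is None:
    print("No div with id 'third' found")
```

Checking for None before accessing attributes avoids an AttributeError when the page does not contain the expected id.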
