BeautifulSoup CSS selector – Selecting nth child

Last Updated : 12 Jan, 2024

In this article, we will see how BeautifulSoup can be used to select the nth child of an element. For this, the select() method of the module is used. select() relies on the SoupSieve package to run a CSS selector against the parsed document.

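Syntax: select("css_selector")

CSS selectors:

tag:nth-of-type(n): Selects every <tag> element that is the nth element of its type among its siblings. Only siblings with the same tag name are counted.

tag:nth-child(n): Selects every <tag> element that is the nth child of its parent. Siblings of every type are counted, so nothing matches if the nth child is a different tag.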
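To make the difference concrete, here is a minimal sketch on a made-up fragment. The markup, the `box` id, and the variable names are assumptions chosen for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: a <div> whose element children are h2, p, p, span, p
html = """
<div id="box">
  <h2>Title</h2>
  <p>first paragraph</p>
  <p>second paragraph</p>
  <span>note</span>
  <p>third paragraph</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# nth-of-type counts only <p> siblings: the 2nd <p> is "second paragraph"
second_p = soup.select("#box > p:nth-of-type(2)")

# nth-child counts all element siblings: the 2nd child is the first <p>
p_as_second_child = soup.select("#box > p:nth-child(2)")

# The 4th child is a <span>, so no <p> matches and the result is empty
p_as_fourth_child = soup.select("#box > p:nth-child(4)")

print([tag.get_text() for tag in second_p])
print([tag.get_text() for tag in p_as_second_child])
print(p_as_fourth_child)
```

Note that select() always returns a list, so an unmatched selector gives an empty list rather than an error.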

Access the nth Child Element in BeautifulSoup

There are several ways to access the nth child of an element with BeautifulSoup. The two commonly used approaches discussed here are:

Extracting the second <b> element from an HTML string
Extracting a specific element from a webpage

Both approaches follow the same steps. First, import the required module. Next, obtain the HTML, either from a string or by requesting a webpage. Then parse the HTML with BeautifulSoup so that it can be navigated as a tree. Finally, use find() to locate the parent tag by class name, ID, or tag name, and call select() on that tag with an nth-of-type or nth-child selector.
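As a small sketch of the find() step, the three lookup criteria can be tried on a made-up fragment. The markup and the class and id values below are hypothetical and serve only to illustrate the calls.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration
html = """
<div id="main" class="wrapper">
  <p class="intro">Hello</p>
  <p class="coding">World</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

by_class = soup.find(class_="coding")  # first tag whose class is "coding"
by_id = soup.find(id="main")           # first tag whose id is "main"
by_tag = soup.find("p")                # first <p> in document order
missing = soup.find(class_="absent")   # find() returns None when nothing matches

print(by_class.get_text())
print(by_id.name)
print(by_tag["class"])
print(missing)
```

Because find() returns None when there is no match, it is worth checking the result before calling select() on it.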

Extracting the Second Element from HTML

In this example, the Python code uses BeautifulSoup to parse HTML markup containing nested elements. It finds the first element with the class "coding" and then runs two selectors inside it: nth-of-type prints the second <b> element, while nth-child prints the <b> element that is the second child of its parent. The two results show that the selectors count siblings differently.

Python3

# importing module
from bs4 import BeautifulSoup
 
markup = """
<html>
    <head>
        <title>GEEKS FOR GEEKS EXAMPLE</title>
    </head>
    <body>
        <p class="1"><b>Geeks for Geeks</b></p>
 
        <p class="coding">A Computer Science portal for geeks.
            <h1>Heading</h1>
            <b class="gfg">Programming Articles</b>,
            <b class="gfg">Programming Languages</b>,
            <b class="gfg">Quizzes</b>;
        </p>
 
        <p class="coding">practice</p>
 
    </body>
</html>
    """
 
# parse the string into a BeautifulSoup tree
soup = BeautifulSoup(markup, 'html.parser')

# first element with class "coding"
parent = soup.find(class_="coding")

# assign n
n = 2

# print the 2nd <b> among the <b> siblings inside parent
print(parent.select(f"b:nth-of-type({n})"))
print()

# print the <b> which is the 2nd child of its parent
print(parent.select(f"b:nth-child({n})"))

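Output:

[<b class="gfg">Programming Languages</b>]

[<b class="gfg">Programming Articles</b>]

The <h1> is the first child of the parent, so the first <b> ("Programming Articles") is its second child, while the second <b> is "Programming Languages".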
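The two selectors in the example above count siblings differently, which is why they can return different elements. As a rough illustration, the sketch below walks the direct element children of a parent and reports each one's position among all children and among siblings with the same tag name. The helper name `describe_children` and the shortened markup are assumptions made for this illustration. It also relies on html.parser keeping the <h1> inside the <p>, which other parsers may not do.

```python
from collections import Counter
from bs4 import BeautifulSoup

def describe_children(parent):
    """Return (tag name, nth-child index, nth-of-type index) per element child."""
    seen = Counter()
    rows = []
    # find_all(True, recursive=False) yields direct child tags only (no text nodes)
    children = parent.find_all(True, recursive=False)
    for child_index, child in enumerate(children, start=1):
        seen[child.name] += 1
        rows.append((child.name, child_index, seen[child.name]))
    return rows

# Shortened version of the markup from the example (an assumption for brevity)
html = """
<p class="coding">A Computer Science portal for geeks.
    <h1>Heading</h1>
    <b>Programming Articles</b>,
    <b>Programming Languages</b>,
    <b>Quizzes</b>;
</p>
"""

soup = BeautifulSoup(html, "html.parser")
parent = soup.find(class_="coding")
rows = describe_children(parent)

for name, child_index, type_index in rows:
    print(f"<{name}>: child #{child_index}, <{name}> #{type_index}")
```

Reading the rows, the <b> with child index 2 is the one matched by b:nth-child(2), and the <b> with type index 2 is the one matched by b:nth-of-type(2).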

Extracting a Specific Element from a Webpage

In this example, the Python code uses BeautifulSoup to scrape the specified GeeksforGeeks webpage. It imports the necessary modules, requests the page content, and parses the HTML. It then finds the first element with the class "wrapper" and, inside it, prints the second <b> element of its type (nth-of-type) and the <b> element that is the second child of its parent (nth-child). The output depends on the content of the page at the time of the request.

Python3

# importing module
from bs4 import BeautifulSoup
import requests
 
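# assign website
sample_website = 'https://www.geeksforgeeks.org/python-programming-language/'
page = requests.get(sample_website, timeout=10)
page.raise_for_status()  # stop early on an HTTP error

# parse the response body into a BeautifulSoup tree
soup = BeautifulSoup(page.content, 'html.parser')
parent = soup.find(class_="wrapper")

# assign n
n = 2

# find() returns None if the page has no element with class "wrapper"
if parent is None:
    print('No element with class "wrapper" found')
else:
    # print the 2nd <b> among the <b> siblings inside parent
    print(parent.select(f"b:nth-of-type({n})"))
    print()

    # print the <b> which is the 2nd child of its parent
    print(parent.select(f"b:nth-child({n})"))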
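The find() and select() steps can also be folded into a single selector. The sketch below shows one way to do that, using a made-up fragment in place of a downloaded page so that it runs offline. The markup is hypothetical, and the class name "wrapper" is reused from the example only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page, so the sketch runs offline
html = """
<div class="wrapper">
  <b>one</b>
  <i>two</i>
  <b>three</b>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
n = 2

# Descendant combinator: any <b> inside .wrapper that is the nth <b> among its siblings
matches = soup.select(f".wrapper b:nth-of-type({n})")

# select_one() returns the first match, or None when nothing matches;
# here the 2nd child is an <i>, so no <b> qualifies
first_match = soup.select_one(f".wrapper b:nth-child({n})")

print(matches)
print(first_match)
```

Unlike find() followed by select(), the combined selector searches inside every element with the class, not only the first one, so the two forms can give different results on pages where the class appears more than once.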