Skip to content
Related Articles

Related Articles

Retrieve children of the html tag using BeautifulSoup
  • Last Updated : 15 Mar, 2021

Prerequisites: Beautifulsoup

Beautifulsoup is a Python module that is used for web scraping. This article discusses how children tags of given HTML tags can be scraped and displayed.

Sample Website:https://www.geeksforgeeks.org/caching-page-tables/

For First Child:

Approach

  • Import module
  • Pass the URL
  • Request page
  • Display the first child using findChild() function

Syntax:



findChild()

Example:

Python3




from bs4 import BeautifulSoup
import requests
  
# sample web page
  
# call get method to request that page
page = requests.get(sample_web_page)
  
# with the help of beautifulSoup and html parser create soup
soup = BeautifulSoup(page.content, "html.parser")
  
child_soup = soup.find('p')
  
print("child :  ", child_soup.findChild())

Output:

child :   <a href=”https://www.geeksforgeeks.org/paging-in-operating-system/” rel=”noopener” target=”_blank”>Paging</a>

For all children:

For Retrieving children of the HTML tag we have to option either we use .children or .contents. The difference between children and contents is children do not take any memory it gives us an iterable list and the contents give the child tag, but it uses the memory. For large HTML files use children is a better option and for storing value need contents will be better.

Approach

  • Import module
  • Pass the website URL
  • Request page
  • Use either keyword to display the child tags

Using .children:



For Retrieve all the children .children will used.

Example:

Python3




from bs4 import BeautifulSoup
import requests
  
# sample web page
  
# call get method to request that page
page = requests.get(sample_web_page)
  
# with the help of beautifulSoup and html parser create soup
soup = BeautifulSoup(page.content, "html.parser")
child_soup = soup.find('p')
  
for i in child_soup.children:
    print("child :  ", i)

Output:

child :   <a href=”https://www.geeksforgeeks.org/paging-in-operating-system/” rel=”noopener” target=”_blank”>Paging</a>

child :    is a memory management scheme which allows physical address space of a process to be non-contiguous. The basic idea of paging is to break physical memory into fixed-size blocks called 

child :   <strong>frames</strong>

child :    and logical memory into blocks of same size called 

child :   <strong>pages</strong>

child :   . While executing the process the required pages of that process are loaded into available frames from their source that is disc or any backup storage device.

Using .contents

It will also return all the child tag and store them in memory.

Example

Python3




from bs4 import BeautifulSoup
import requests
  
# sample web page
  
# call get method to request that page
page = requests.get(sample_web_page)
  
# with the help of beautifulSoup and html parser create soup
soup = BeautifulSoup(page.content, "html.parser")
  
child_soup = soup.find('p')
  
print("child :  ", child_soup.contents)

Output

child :   [<a href=”https://www.geeksforgeeks.org/paging-in-operating-system/” rel=”noopener” target=”_blank”>Paging</a>, ‘ is a memory management scheme which allows physical address space of a process to be non-contiguous. The basic idea of paging is to break physical memory into fixed-size blocks called ‘, <strong>frames</strong>, ‘ and logical memory into blocks of same size called ‘, <strong>pages</strong>, ‘. While executing the process the required pages of that process are loaded into available frames from their source that is disc or any backup storage device.’]

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning – Basic Level Course

My Personal Notes arrow_drop_up