When extracting data from an HTML page, you may want to know how many paragraph (<p>) tags the document contains. This article shows how to count them with BeautifulSoup.
Syntax:
print(len(soup.find_all("p")))
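As a quick illustration of this call, here is a minimal sketch that parses an in-memory HTML string instead of a file. The sample markup and the names `sample_html`, `paragraphs`, and `paragraph_count` are made up for the example. `find_all("p")` returns a list-like collection of every `<p>` tag, so `len` gives the count.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
sample_html = "<html><body><p>One</p><div><p>Two</p></div></body></html>"

soup = BeautifulSoup(sample_html, "html.parser")

# find_all returns every matching tag, at any nesting depth
paragraphs = soup.find_all("p")
paragraph_count = len(paragraphs)
print(paragraph_count)  # 2
```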
Approach:
Step 1: First, import BeautifulSoup and the os module.
from bs4 import BeautifulSoup as bs
import os
Step 2: Now, get the directory that contains your script. Pass the name of the Python file you are currently working in to os.path.abspath to obtain its full path, then strip the last segment of that path with os.path.dirname.
base = os.path.dirname(os.path.abspath('#Name of Python file in which you are currently working'))
Step 3: Then, open the HTML file from which you want to read the value.
html = open(os.path.join(base, '#Name of HTML file from which you wish to read value'))
Step 4: Next, parse the HTML file with BeautifulSoup.
soup = bs(html, 'html.parser')
Step 5: Optionally, print a label for the result.
print("Number of paragraph tags:")
Step 6: Finally, calculate and print the number of paragraph tags in the HTML document.
print(len(soup.find_all("p")))
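The steps above can also be wrapped in a small reusable function. The following is a sketch under stated assumptions, not the article's own method: the name `count_paragraph_tags` is hypothetical, the file is assumed to be UTF-8 encoded, and a `with` block is used so the file is closed once BeautifulSoup has read it.

```python
from bs4 import BeautifulSoup


def count_paragraph_tags(html_path):
    """Return the number of <p> tags in the HTML file at html_path."""
    # Assumption: the file is UTF-8 encoded.
    # The with-block closes the file even if parsing raises.
    with open(html_path, encoding="utf-8") as html_file:
        soup = BeautifulSoup(html_file, "html.parser")
    return len(soup.find_all("p"))
```

Calling `count_paragraph_tags(os.path.join(base, ...))` with the path built in Steps 2 and 3 would return the same count that Step 6 prints.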
Implementation:
Example 1
Consider the following simple HTML page, saved as gfg.html, which contains several paragraph tags.
<!DOCTYPE html>
<html>
<head>
Geeks For Geeks
</head>
<body>
<div>
<p>King</p>
<p>Prince</p>
<p>Queen</p>
</div>
<p id="vinayak">Princess</p>
</body>
</html>
To count the paragraph tags in the above HTML page, run the following code.
# Python program to get the number of paragraph tags
# of a given HTML document in BeautifulSoup

# Import the libraries BeautifulSoup and os
from bs4 import BeautifulSoup as bs
import os

# Open the HTML file
html = open('gfg.html')

# Parse HTML file in Beautiful Soup
soup = bs(html, 'html.parser')

# Print a certain line
print("Number of paragraph tags:")

# Calculating and printing the number of paragraph tags
print(len(soup.find_all("p")))
Output:
Number of paragraph tags:
4
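Note that find_all searches the whole document, so the three paragraphs nested inside the div and the one outside it are all counted. If you only want some of them, the search can be narrowed. The sketch below embeds the Example 1 page as a string (the variable names `html_doc`, `total`, `inside_div`, and `with_id` are illustrative) and compares three counts.

```python
from bs4 import BeautifulSoup

# The page from Example 1, embedded as a string for illustration
html_doc = """<!DOCTYPE html>
<html>
<head>
Geeks For Geeks
</head>
<body>
<div>
<p>King</p>
<p>Prince</p>
<p>Queen</p>
</div>
<p id="vinayak">Princess</p>
</body>
</html>"""

soup = BeautifulSoup(html_doc, "html.parser")

# Every <p> in the document, at any depth
total = len(soup.find_all("p"))
# Only the <p> tags under the first <div>
inside_div = len(soup.div.find_all("p"))
# Only the <p> tags whose id attribute is "vinayak"
with_id = len(soup.find_all("p", id="vinayak"))

print(total, inside_div, with_id)  # 4 3 1
```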
Example 2
The following program counts the paragraph tags on a live website.
# Python program to get the number of paragraph tags
# of a given website in BeautifulSoup

# Import the libraries BeautifulSoup, os and requests
from bs4 import BeautifulSoup as bs
import os
import requests

# Assign URL
URL = '#URL of the website you wish to read'

# Page content from Website URL
page = requests.get(URL)

# Parse HTML file in Beautiful Soup
soup = bs(page.content, 'html.parser')

# Print a certain line
print("Number of paragraph tags:")

# Calculating and printing the number of paragraph tags
print(len(soup.find_all("p")))
Output:
The number printed depends on the page fetched from the given URL.
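When fetching a live page, it may be worth guarding against slow or failed requests before counting. The following is a minimal sketch, not part of the original program: the helper names `count_paragraphs_in_markup` and `count_paragraphs_at` and the 10-second default timeout are assumptions chosen for illustration.

```python
import requests
from bs4 import BeautifulSoup


def count_paragraphs_in_markup(markup):
    """Return the number of <p> tags in an HTML string or bytes object."""
    return len(BeautifulSoup(markup, "html.parser").find_all("p"))


def count_paragraphs_at(url, timeout=10):
    """Fetch url and return its <p> count (hypothetical helper)."""
    # A timeout keeps the script from hanging on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    return count_paragraphs_in_markup(response.content)
```

Call `count_paragraphs_at` with the URL of your choice. Only the paragraph tags present in the HTML returned by the server are counted; paragraphs that a page adds later through JavaScript are not seen by requests.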