Find the length of the text of the first given tag using BeautifulSoup
Last Updated: 09 Jan, 2023
In this article, we will find the length of the text of the first given tag using BeautifulSoup.
Let us look at a sample. In the code below, soup = BeautifulSoup(html_doc, 'html.parser') parses the entire given HTML document with html.parser. The call soup.find('h2') accepts any valid HTML tag name and searches the document for the first matching tag, and the .text attribute returns the text inside that tag. If the tag is present, the remaining operations run on it. If the specified tag is not present, find() returns None, so accessing .text raises an AttributeError.
Here we are calculating a length, so we use the len() function. The len() function returns the number of items in an object; for a string, it returns the number of characters in that string.
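The missing-tag case described above can be handled explicitly instead of letting the AttributeError surface. The following is a minimal sketch, not the article's own code: the helper name first_tag_text_length and the SAMPLE_HTML string are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the article.
SAMPLE_HTML = "<html><body><h2>Hello, world</h2><p>Body text</p></body></html>"


def first_tag_text_length(html, tag_name):
    """Return the text length of the first <tag_name>, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(tag_name)  # find() returns None when nothing matches
    if tag is None:
        return None
    return len(tag.text)


print(first_tag_text_length(SAMPLE_HTML, "h2"))  # 12
print(first_tag_text_length(SAMPLE_HTML, "h3"))  # None
```

Returning None keeps the decision about a missing tag with the caller; raising a custom error would be an equally reasonable choice.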
Example 1:
In this example, we read the text inside the first “h2” tag and count the number of characters in that string.
Python3
from bs4 import BeautifulSoup
html_doc =
soup = BeautifulSoup(html_doc, 'html.parser')
print("Length of the text of the first <h2> tag:")
print(len(soup.find('h2').text))
Output:
Length of the text of the first <h2> tag:
59
The expression soup.find('h2').text retrieves the text enclosed in the first h2 tag. The len() function then returns the length of that text.
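Keep in mind that .text includes the text of nested tags as well as any leading or trailing whitespace, so the count can be larger than the visible heading suggests. A small sketch, assuming a made-up heading (html_sample is hypothetical, not the article's document):

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the heading has padding and a nested <em> tag.
html_sample = "<h2>  Scraping <em>with</em> Python  </h2>"

heading = BeautifulSoup(html_sample, "html.parser").find("h2")

raw_length = len(heading.text)              # counts the surrounding spaces too
trimmed_length = len(heading.text.strip())  # counts only "Scraping with Python"

print(raw_length)      # 24
print(trimmed_length)  # 20
```

Whether to strip the whitespace depends on what the length is meant to measure; the article's examples count the raw text.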
Example 2:
Get the text length of every HTML tag present in the given HTML.
Python3
from bs4 import BeautifulSoup
html_doc =
soup = BeautifulSoup(html_doc, 'html.parser')
for tag in soup.findAll(True):
    print(tag.name, " : ", len(soup.find(tag.name).text))
Output:
The findAll(True) method returns every tag in the document (findAll is the older alias of find_all). The statement for tag in soup.findAll(True): iterates over all the tags that were found, and print(tag.name, " : ", len(soup.find(tag.name).text)) displays each tag name together with a text length. Note that soup.find(tag.name) always returns the first tag with that name, so a tag name that occurs several times reports the length of its first occurrence each time.
To report only the first tag, add a break statement after the print statement in the code above.
Python3
for tag in soup.findAll(True):
    print(tag.name, " : ", len(soup.find(tag.name).text))
    break
Output:
html : 270
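When a tag name occurs more than once, looking it up again with soup.find(tag.name) measures the first occurrence rather than the tag currently being visited. The sketch below contrasts the two approaches on a made-up document (html_sample is a hypothetical example, not the article's input) and uses find_all, the current name for findAll:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two <p> tags of different lengths.
html_sample = "<div><p>one</p><p>three</p></div>"
soup = BeautifulSoup(html_sample, "html.parser")

# Length of each tag's own text, in document order.
own_lengths = [(tag.name, len(tag.text)) for tag in soup.find_all(True)]

# Length of the first tag sharing the same name, as in the loop above.
first_lengths = [
    (tag.name, len(soup.find(tag.name).text)) for tag in soup.find_all(True)
]

print(own_lengths)    # [('div', 8), ('p', 3), ('p', 5)]
print(first_lengths)  # [('div', 8), ('p', 3), ('p', 3)]
```

Using tag.text directly is the safer choice when the goal is one length per tag.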
Example 3:
In this example, we will find the text length of a particular given tag from an HTML document.
Python3
from bs4 import BeautifulSoup
html_doc =
soup = BeautifulSoup(html_doc, 'html.parser')
tag = "html"
print("Length of the text of", tag, "tag is:",
      len(soup.find(tag).text))
Output:
Length of the text of html tag is: 5062
Example 4:
Now let us see how to get the text length of a tag from a live web page such as Monster. Because the data comes from a URL, we need the requests module to fetch the page.
Python3
from bs4 import BeautifulSoup
import requests
# monsterPageURL is assumed to be a string holding the URL of the page to fetch
monsterPage = requests.get(monsterPageURL)
soupResults = BeautifulSoup(monsterPage.content, 'html.parser')
tag = "title"
print("Length of the text of", tag, "tag is:",
      len(soupResults.find(tag).text))
Output:
Length of the text of title tag is: 57
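A network request can hang or return an error page, so it may help to set a timeout and check the HTTP status before parsing. The following is only a sketch under those assumptions: the function names tag_text_length and fetch_tag_text_length and the default timeout of 10 seconds are illustrative choices, and the URL has to be supplied by the caller.

```python
from bs4 import BeautifulSoup
import requests


def tag_text_length(html, tag_name):
    """Return the text length of the first <tag_name> in html, or None if absent."""
    tag = BeautifulSoup(html, "html.parser").find(tag_name)
    return None if tag is None else len(tag.text)


def fetch_tag_text_length(url, tag_name, timeout=10):
    """Download url and measure the first <tag_name>; raise on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return tag_text_length(response.content, tag_name)
```

Separating the parsing from the download means the parsing step can be checked on a literal string, for example tag_text_length("<title>Demo page</title>", "title"), without touching the network.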