Converting HTML to Text with BeautifulSoup
Last Updated: 16 Apr, 2021
When working with web automation or scraping, we often need to convert HTML markup into plain text. This can be done with BeautifulSoup. Parsing the HTML produces an object whose get_text() method returns the text of the document (or of a single tag) with the markup removed.
Example 1:
Python3
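from bs4 import BeautifulSoup

# Parse an HTML fragment; naming the parser explicitly avoids a warning
html = "<b>Section </b><br/>BeautifulSoup<ul><li>Example <b>1</b></li></ul>"
gfg = BeautifulSoup(html, "html.parser")

# get_text() returns all text nodes joined together, with tags removed
res = gfg.get_text()
print(res)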
Output:
Section BeautifulSoupExample 1
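Note that the output above runs "BeautifulSoup" and "Example" together, because get_text() joins text nodes with no separator by default. The following is a minimal sketch of one way to control this, using the separator and strip arguments of get_text(); the variable names are illustrative, and the HTML is the same fragment as in Example 1.

```python
from bs4 import BeautifulSoup

html = "<b>Section </b><br/>BeautifulSoup<ul><li>Example <b>1</b></li></ul>"
soup = BeautifulSoup(html, "html.parser")

# Default behaviour: text nodes are concatenated with nothing between them
plain = soup.get_text()

# separator is placed between text nodes; strip=True trims each node
# and skips nodes that contain only whitespace
spaced = soup.get_text(separator=" ", strip=True)

print(plain)   # Section BeautifulSoupExample 1
print(spaced)  # Section BeautifulSoup Example 1
```

Using separator="\n" instead would put each text node on its own line, which can be easier to read for longer documents.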
Example 2: This example fetches HTML from a live website and converts it into text. It uses the request module from Python's urllib package to read the HTML from a URL.
Python3
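from bs4 import BeautifulSoup
from urllib import request

# The source does not give the URL; set url to the page you want to scrape
url = "..."

# Download the page and parse it
gfg = BeautifulSoup(request.urlopen(url).read(), "html.parser")
# Select <article class="content"> (find() returns None if there is no match)
bodyHtml = gfg.find('article', {'class': 'content'})
res = bodyHtml.get_text()
print(res)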
Output: The text of the page's <article class="content"> element. The exact text depends on the URL used.
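Because find() returns None when a page has no matching element, calling get_text() on its result can raise an AttributeError. Below is a hedged sketch of a more defensive version of the extraction step. The helper name extract_text and the sample HTML are hypothetical, introduced only for illustration, and the helper works on an HTML string so it can be tried without network access.

```python
from bs4 import BeautifulSoup


def extract_text(html, tag="article", css_class="content"):
    """Return the text of the first matching element, or None if absent.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find(tag, class_=css_class)
    if node is None:
        # No such element on this page
        return None
    # One text node per line, with surrounding whitespace trimmed
    return node.get_text(separator="\n", strip=True)


# Made-up sample page standing in for a downloaded document
sample = """
<html><body>
  <nav>Menu</nav>
  <article class="content"><h1>Title</h1><p>First paragraph.</p></article>
</body></html>
"""

print(extract_text(sample))
print(extract_text("<p>No article here</p>"))  # None
```

With a live page, the same helper could be given the bytes returned by request.urlopen(url).read() in place of the sample string.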