
Get all HTML tags with BeautifulSoup

Web scraping is the process of using programs called web scrapers to extract information from HTML or XML content. Beautiful Soup is a Python library built for this task. It parses the HTML content of a web page and exposes it for iteration, searching, and modification. To provide these features it works with a parser that converts the content into a parse tree. With a parser you are comfortable with, it is fairly easy to navigate a web page using BeautifulSoup.
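As a rough illustration of the iteration, searching, and modification features mentioned above, here is a minimal sketch that parses a small HTML string instead of a live page. The HTML snippet and the variable names are made up for this example; `html.parser` is the parser that ships with Python.

```python
from bs4 import BeautifulSoup

# A tiny, made-up document used only for illustration
html = "<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>"

# html.parser is Python's built-in parser, so no extra install is needed
soup = BeautifulSoup(html, "html.parser")

# Iteration: walk the direct children of <body>
child_names = [child.name for child in soup.body.children]

# Searching: collect the text of every <p> element
paragraphs = [p.get_text() for p in soup.find_all("p")]

# Modification: change the heading text in place
soup.h1.string = "New title"

print(child_names)
print(paragraphs)
print(soup.h1)
```

All three operations act on the same parse tree, so the change to the heading is visible the next time the tree is searched or printed.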

To get all the HTML tags of a web page using the BeautifulSoup library, first import BeautifulSoup and the requests library, which is used to make a GET request to the web page.



Step-by-step Approach:




from bs4 import BeautifulSoup
import requests




# Assign URL (placeholder; replace with the page you want to inspect)
url = "https://www.example.com/"

# Make a GET request to fetch the raw HTML content
html_content = requests.get(url).text




# Parse the HTML content using any parser
soup = BeautifulSoup(html_content, "html.parser")




# Collect the name of every tag; find_all() with no arguments matches all tags
[tag.name for tag in soup.find_all()]
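The steps above assume the request succeeds. A slightly more defensive sketch might separate fetching from parsing, add a timeout, and raise on HTTP errors. The function names `fetch_html` and `tag_names` and the 10-second timeout are assumptions made for this example, not part of the original program.

```python
from bs4 import BeautifulSoup
import requests


def fetch_html(url, timeout=10):
    """Fetch a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
    return response.text


def tag_names(html):
    """Return the name of every tag in the document, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.name for tag in soup.find_all()]


# Offline check on a made-up snippet; no network access is needed here
sample = "<html><head><title>Demo</title></head><body><p>Hi</p></body></html>"
sample_tags = tag_names(sample)
print(sample_tags)
```

For a live page, the two helpers can be chained as `tag_names(fetch_html(url))`. Keeping the parsing step separate makes it easy to try out on a local string before any request is made.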

Below is the complete program:






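# Import modules
from bs4 import BeautifulSoup
import requests

# Assign URL (placeholder; replace with the page you want to inspect)
url = "https://www.example.com/"

# Make a GET request to fetch the raw HTML content
html_content = requests.get(url).text

# Parse the HTML content using any parser
soup = BeautifulSoup(html_content, "html.parser")

# Display HTML tags (a notebook or REPL echoes this list;
# in a script, wrap the expression in print())
[tag.name for tag in soup.find_all()]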

Output (the exact list depends on the page fetched):

['html',
 'head',
 'meta',
 'meta',
 'meta',
 'link',
 'meta',
 'meta',
 'meta',
 'meta',
 'meta',
 'script',
 'script',
 'link',
 'title',
 'link',
 'link',
 'script',
 'script']
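The list above contains many repeated names. If a summary is more useful than the raw list, one possible follow-up is to count each tag with `collections.Counter` and to extract the distinct names in first-seen order. The `tags` list below is copied from the output above; the other variable names are assumptions made for this sketch.

```python
from collections import Counter

# Tag names copied from the output above
tags = ['html', 'head', 'meta', 'meta', 'meta', 'link', 'meta', 'meta',
        'meta', 'meta', 'meta', 'script', 'script', 'link', 'title',
        'link', 'link', 'script', 'script']

# Count how many times each tag name occurs
counts = Counter(tags)

# Distinct tag names, keeping the order in which they first appear
unique_tags = list(dict.fromkeys(tags))

print(counts.most_common())
print(unique_tags)
```

With a live page, the same two lines can be applied directly to the list produced by `[tag.name for tag in soup.find_all()]`.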
