Get all HTML tags with BeautifulSoup

Last Updated : 25 Feb, 2021

Web scraping is the process of using software bots, called web scrapers, to extract information from HTML or XML content. Beautiful Soup is a Python library used for this kind of scraping. It parses the HTML content of a web page and exposes it for iteration, searching and modification. To provide these features it works with a parser that converts the content into a parse tree. With a parser you are comfortable with, it is fairly easy to navigate web pages using BeautifulSoup.

To get all the HTML tags of a web page using the BeautifulSoup library, first import BeautifulSoup along with the requests library, which is used to make a GET request to the web page.

Step-by-step Approach:

Import required modules.

Python3

from bs4 import BeautifulSoup 
import requests

After importing the libraries, assign the URL of the web page to a variable and make a GET request to fetch the raw HTML content:

Python3

# Assign URL 
url = "https://www.geeksforgeeks.org/"
  
# Make a GET request to fetch the raw HTML content 
html_content = requests.get(url).text 
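The one-line request above assumes that the request succeeds. A slightly more defensive version might add a timeout and a status check. This is only a sketch: the helper name fetch_html and the 10-second timeout are illustrative choices, not part of the original program.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page's HTML as text, or None if the request fails.

    fetch_html and the default timeout are hypothetical choices
    made for this sketch.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Raise an HTTPError for 4xx/5xx responses
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # Covers invalid URLs, connection errors, timeouts and HTTP errors
        print(f"Request failed: {exc}")
        return None
    return response.text
```

Returning None keeps the sketch short; a larger program might prefer to let the exception propagate so the caller can decide how to handle it.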

Now parse the HTML content:

Python3

# Parse the html content using any parser 
soup = BeautifulSoup(html_content, "html.parser") 
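To see what the parse tree looks like without making a network request, the same constructor can be given a literal string. The markup and the variable names below are made up for illustration.

```python
from bs4 import BeautifulSoup

# A small, made-up document used only for illustration
sample_html = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Hi <b>there</b></p></body></html>"
)

sample_soup = BeautifulSoup(sample_html, "html.parser")

# Tags in the parse tree can be reached as attributes of the soup object
print(sample_soup.title.name)         # title
print(sample_soup.title.string)       # Demo
print(sample_soup.title.parent.name)  # head
print(sample_soup.p.b.string)         # there
```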

Now, to get all the HTML tags of the web page, call find_all() with no arguments, which returns every tag in the document, and collect the .name attribute of each tag:

Python3

[tag.name for tag in soup.find_all()]
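The behaviour of this expression is easier to check on a small, fixed input. The snippet below is a sketch on a made-up HTML fragment; it shows that the tags come back in document order, with nested tags following their parents.

```python
from bs4 import BeautifulSoup

# Made-up fragment used only for illustration
fragment = "<div><h1>Title</h1><p>Text with a <a href='#'>link</a>.</p></div>"

fragment_soup = BeautifulSoup(fragment, "html.parser")

# find_all() with no arguments matches every tag in the document
tag_names = [tag.name for tag in fragment_soup.find_all()]
print(tag_names)  # ['div', 'h1', 'p', 'a']

# Passing True as the name filter also matches every tag
same_names = [tag.name for tag in fragment_soup.find_all(True)]
print(tag_names == same_names)  # True
```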

Below is the complete program:

Python3

# Import modules 
from bs4 import BeautifulSoup 
import requests 
  
# Assign URL 
url = "https://www.geeksforgeeks.org/"
  
# Make a GET request to fetch the raw HTML content 
html_content = requests.get(url).text 
  
# Parse the html content using any parser 
soup = BeautifulSoup(html_content, "html.parser") 
  
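# Display HTML tags (print() is needed when this runs as a script) 
print([tag.name for tag in soup.find_all()]) 

Output (shown one tag per line for readability; the exact list depends on the page's current markup):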

['html',
 'head',
 'meta',
 'meta',
 'meta',
 'link',
 'meta',
 'meta',
 'meta',
 'meta',
 'meta',
 'script',
 'script',
 'link',
 'title',
 'link',
 'link',
 'script',
 'script']
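The list above repeats a name once for every occurrence of the tag, so 'meta' and 'script' appear several times. If only the distinct tags or their frequencies are needed, the names can be passed to collections.Counter. The sketch below uses a made-up HTML string instead of a live page so that the result is reproducible.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Made-up document used only for illustration
sample_page = """
<html><head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<title>Demo</title>
</head>
<body><p>One</p><p>Two</p></body></html>
"""

page_soup = BeautifulSoup(sample_page, "html.parser")

# Count how many times each tag name occurs
tag_counts = Counter(tag.name for tag in page_soup.find_all())

# Distinct tag names, sorted alphabetically
unique_tags = sorted(tag_counts)

print(unique_tags)         # ['body', 'head', 'html', 'meta', 'p', 'title']
print(tag_counts["meta"])  # 2
```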
