Get all HTML tags with BeautifulSoup

Web scraping is the process of using software bots, called web scrapers, to extract information from HTML or XML content. Beautiful Soup is a Python library built for this purpose. It parses the HTML content of a web page into a tree that can be iterated over, searched and modified. To do this it relies on an underlying parser that converts the markup into a parse tree. With a parser you are comfortable with, it is fairly easy to work through the content of a web page using BeautifulSoup.
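As a rough illustration of the iteration, searching and modification features mentioned above, here is a minimal sketch. The inline HTML string and the class name "intro" are made up for the example and stand in for a real page.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document, used here instead of a live web page
html = "<html><body><h1>Title</h1><p class='intro'>Hello</p><p>World</p></body></html>"

# Build the parse tree with Python's built-in parser
soup = BeautifulSoup(html, "html.parser")

# Iteration: visit every tag in the tree in document order
names = [tag.name for tag in soup.find_all(True)]

# Searching: find the first <p> tag whose class is "intro"
intro = soup.find("p", class_="intro")

# Modification: replace the text of that tag in place
intro.string = "Hi"

print(names)
print(intro.get_text())
```

Because the markup is defined inline, the sketch runs without any network access.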

To get all the HTML tags of a web page with BeautifulSoup, first import BeautifulSoup along with the requests library, which is used to make a GET request to the web page.

Step-by-step Approach:

Import required modules.

Python3

from bs4 import BeautifulSoup 

import requests

After importing the libraries, assign the URL of the web page to a variable and make a GET request to fetch the raw HTML content:

Python3

# Assign URL 

url = "https://www.geeksforgeeks.org/"

# Make a GET request to fetch the raw HTML content 

html_content = requests.get(url).text
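The request above assumes the server answers promptly and successfully. A slightly more defensive variant might look like the sketch below. The helper name fetch_html and the 10-second default timeout are assumptions made for illustration, not part of the original program.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the response body of `url` as text (hypothetical helper)."""
    # Give up if the server does not respond within `timeout` seconds
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.text
```

It could then be used as html_content = fetch_html(url) in place of the direct requests.get(url).text call.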

Now parse the HTML content:

Python3

# Parse the html content using any parser  

soup = BeautifulSoup(html_content, "html.parser")
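The parser name is passed as the second argument, so other parsers can be swapped in. The sketch below tries the third-party "lxml" parser and falls back to the built-in "html.parser" if it is not installed. The helper name make_soup and the choice of "lxml" as the preferred parser are assumptions for the example.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml"):
    """Parse `markup`, falling back to the built-in parser (hypothetical helper)."""
    try:
        # Works only if the preferred parser is installed
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # html.parser ships with Python, so it is always available
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>hi</p>")
print(soup.find("p").get_text())
```

Different parsers may build slightly different trees from the same markup, particularly for incomplete or invalid HTML, so the list of tags can vary with the parser chosen.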

Now, to get all the HTML tags of the web page, call find_all() with no arguments, which returns every tag in the document, and read the .name attribute of each tag in a loop:

Python3

[tag.name for tag in soup.find_all()]
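The list produced this way contains one entry per tag, so names repeat. If only the distinct tag names are needed, one option is to de-duplicate while keeping first-seen order, as in this sketch. The inline HTML is a made-up stand-in for a fetched page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page
html = (
    "<html><head><title>T</title></head>"
    "<body><p>a</p><p>b</p><a href='#'>c</a></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# One entry per tag, in document order (duplicates included)
all_names = [tag.name for tag in soup.find_all(True)]

# dict.fromkeys keeps the first occurrence of each name and preserves order
unique_names = list(dict.fromkeys(all_names))

print(all_names)
print(unique_names)
```

Passing True to find_all() matches every tag, the same as calling it with no arguments.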

Below is the complete program:

Python3

# Import modules 

from bs4 import BeautifulSoup 

import requests 

# Assign URL 

url = "https://www.geeksforgeeks.org/"

# Make a GET request to fetch the raw HTML content 

html_content = requests.get(url).text 

# Parse the html content using any parser 

soup = BeautifulSoup(html_content, "html.parser") 

# Collect the name of every tag (wrap the expression in print() when running as a script)

[tag.name for tag in soup.find_all()]

Output (the exact list depends on the page's current markup):

['html',
 'head',
 'meta',
 'meta',
 'meta',
 'link',
 'meta',
 'meta',
 'meta',
 'meta',
 'meta',
 'script',
 'script',
 'link',
 'title',
 'link',
 'link',
 'script',
 'script']
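Since names such as meta, link and script appear many times in a list like this, a per-tag count can be easier to read. Here is a minimal sketch using collections.Counter from the standard library. The inline HTML is invented for the example and is not the page fetched above.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical markup with repeated tags in the <head>
html = (
    "<html><head>"
    "<meta charset='utf-8'><meta name='x' content='y'>"
    "<link rel='stylesheet' href='s.css'><title>T</title>"
    "</head><body></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# Count how many times each tag name occurs in the document
tag_counts = Counter(tag.name for tag in soup.find_all(True))

# Print the names from most to least frequent
for name, count in tag_counts.most_common():
    print(name, count)
```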
