Find tags by CSS class using BeautifulSoup
In this article, we will discuss how to find tags by CSS class using BeautifulSoup. Given an HTML document, we need to find and extract tags from it by their CSS class.
HTML Document:
<html>
<head> <title> Geeksforgeeks </title> </head>
<body>
<div class="ext" >Extract this tag</div>
</body>
</html>
Output:
<div class="ext" >Extract this tag</div>
- bs4: Beautiful Soup (bs4) is a Python library used to pull data out of HTML and XML documents.
Make sure you have pip installed on your system.
Run either of the following commands in the terminal to install this library (the bs4 package on PyPI is a thin wrapper that installs beautifulsoup4):
pip install bs4
or
pip install beautifulsoup4
- Import bs4 library
- Create an HTML doc
- Parse the content into a BeautifulSoup object
- Search by CSS class: the name of the HTML attribute, “class”, is a reserved word in Python, so using class as a keyword argument raises a SyntaxError. BeautifulSoup therefore lets us search by CSS class using the keyword argument class_
We can pass class_ a string, a regular expression, a function, or True.
- find_all() with the keyword argument class_ returns all the tags that have the given CSS class
If only the first matching tag is needed, find() is used instead; it returns None when no tag matches
- Print the extracted tags.
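The steps above can be sketched as follows, using the HTML document from the start of the article. This is a minimal illustration rather than the article's original listing; the variable names are assumptions, and Python's built-in "html.parser" is used so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Step 2: create an HTML doc (the sample document shown earlier)
html_doc = """
<html>
<head><title>Geeksforgeeks</title></head>
<body>
<div class="ext">Extract this tag</div>
</body>
</html>
"""

# Step 3: parse the content into a BeautifulSoup object
soup = BeautifulSoup(html_doc, "html.parser")

# Step 4: search by CSS class with the class_ keyword argument
ext_tags = soup.find_all(class_="ext")

# Passing True matches every tag that has a class attribute at all
any_class_tags = soup.find_all(class_=True)

# Step 5: print the extracted tags
for tag in ext_tags:
    print(tag)  # <div class="ext">Extract this tag</div>
```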
Example 1: Find the tag using find() method
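A possible version of this example is sketched below. The sample document and the class names "intro" and "no-such-class" are made up for illustration; only find() and the class_ keyword come from the article.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document with two tags sharing the class "ext"
html_doc = """
<div class="intro">First paragraph</div>
<div class="ext">Extract this tag</div>
<div class="ext">A second tag with the same class</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find() stops at the first tag that matches
first_match = soup.find("div", class_="ext")
print(first_match)  # <div class="ext">Extract this tag</div>

# When nothing matches, find() returns None instead of raising an error
missing = soup.find("div", class_="no-such-class")
print(missing)  # None
```

Because find() can return None, it is usually worth checking the result before accessing attributes on it.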
Example 2: Find all the tags using find_all() method
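One way this example might look is shown below. The list markup and class names are assumptions chosen for illustration. The sketch also shows that a tag carrying several classes is matched when any one of them equals the given string.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; the second <li> has two classes
html_doc = """
<ul>
  <li class="item">One</li>
  <li class="item highlighted">Two</li>
  <li class="other">Three</li>
</ul>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all() returns every matching tag, in document order
items = soup.find_all("li", class_="item")

for tag in items:
    print(tag)
```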
Example 3: Finding tags by CSS class using Regular Expressions.
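The original listing for this example is not available here, so the following is a reconstruction under assumptions: the table and the class name "table-data" are invented so that only the two "table-row" cells match, and the pattern "row$" is one plausible way to express "class name ends with row".

```python
import re

from bs4 import BeautifulSoup

# Hypothetical table: rows 2 and 4 use a class that ends with "row"
html_doc = """
<table>
  <tr>
    <td class="table-data"> This is row 1 </td>
    <td class="table-row"> This is row 2 </td>
    <td class="table-data"> This is row 3 </td>
    <td class="table-row"> This is row 4 </td>
  </tr>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# A compiled regular expression is matched against each tag's class name
ends_with_row = re.compile("row$")
row_tags = soup.find_all(class_=ends_with_row)

for tag in row_tags:
    print(tag)
```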
Output:
<td class="table-row"> This is row 2 </td>
<td class="table-row"> This is row 4 </td>
The class names of the two tags above end with “row”, so they are extracted. The class names of the other tags do not end with “row”, so they are not extracted.
Example 4: Finding tags by CSS class using the user-defined function.
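A sketch of this approach is given below. The function name starts_with_table and the sample markup are hypothetical. The function receives the value of a tag's class attribute, which may be None for tags that have no class, so it guards against that case.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document
html_doc = """
<div class="table-title">Title</div>
<div class="note">Note</div>
<div>No class</div>
<span class="table-row">Row</span>
"""


def starts_with_table(css_class):
    """Return True for class names that begin with 'table'."""
    # css_class can be None when a tag has no class attribute
    return css_class is not None and css_class.startswith("table")


soup = BeautifulSoup(html_doc, "html.parser")

# The function is called for each tag; tags for which it returns True are kept
matched = soup.find_all(class_=starts_with_table)

for tag in matched:
    print(tag)
```

A function is useful when the condition is awkward to express as a string or a regular expression, for example a length check or a lookup in a set of allowed names.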
Example 5: Finding tags by CSS class from a website
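The article does not say which website or class name this example uses, so the sketch below keeps both as parameters. The helper names fetch_html and tags_with_class are hypothetical, and the standard-library urllib is used for the download as an assumption; the original may have used a different HTTP client.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def tags_with_class(html, css_class):
    """Parse HTML (str or bytes) and return all tags with the given CSS class."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all(class_=css_class)


# Example usage (placeholder URL and class name; requires network access):
# for tag in tags_with_class(fetch_html("https://example.com/"), "some-class"):
#     print(tag)
```

Keeping the download separate from the parsing makes the search logic easy to try on a local HTML string before pointing it at a live site.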