In this article, we will see how to scrape Google search results using Python and Beautiful Soup.
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library. To install it, run the command below in the terminal.
pip install bs4
- requests: Requests lets you send HTTP/1.1 requests with very little code. It is also not part of the Python standard library. To install it, run the command below in the terminal.
pip install requests
Approach:
- Import the BeautifulSoup and requests libraries.
- Make two strings: the default Google search URL, 'https://google.com/search?q=', and our search keyword.
- Concatenate these two strings to get our search URL.
- Fetch the URL data using requests.get(url) and store the response in a variable, request_result.
- Read the body of the response as a string using request_result.text.
- Create a BeautifulSoup object (soup) from that string to parse the fetched page. Beautiful Soup provides many built-in methods for navigating and searching the parsed document.
- Call soup.find_all('h3') to grab all the major headings of the search results, then iterate through the returned elements and print the text of each one.
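Plain concatenation only produces a valid URL when the keyword has no spaces or special characters. As a minimal sketch of the URL-building steps, the snippet below encodes the keyword with urllib.parse.quote_plus from the standard library before concatenating. The names GOOGLE_SEARCH_URL and build_search_url are illustrative choices, not part of any library.

```python
from urllib.parse import quote_plus

# Default Google search URL from the steps above.
GOOGLE_SEARCH_URL = "https://google.com/search?q="


def build_search_url(keyword):
    """Concatenate the base search URL with a URL-encoded keyword."""
    # quote_plus turns spaces into '+' and escapes reserved characters.
    return GOOGLE_SEARCH_URL + quote_plus(keyword)


print(build_search_url("python web scraping"))
# prints https://google.com/search?q=python+web+scraping
```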
Example 1: Below is the implementation of the above approach.
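The following is a minimal sketch of the approach, not a definitive implementation. It assumes the page Google returns to a plain requests call still marks result titles with h3 tags. Google may instead serve a consent page or different markup, in which case the list of headings will be empty. The helper names extract_headings and search_headings, the keyword "geeksforgeeks", and the 10-second timeout are illustrative choices.

```python
import requests
from bs4 import BeautifulSoup

# Default Google search URL; the keyword is appended to it.
SEARCH_URL = "https://google.com/search?q="


def extract_headings(html):
    """Return the text of every <h3> element found in the HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.find_all("h3")]


def search_headings(keyword):
    """Fetch the search page for the keyword and return its <h3> headings."""
    url = SEARCH_URL + keyword
    request_result = requests.get(url, timeout=10)
    request_result.raise_for_status()  # stop early on an HTTP error status
    return extract_headings(request_result.text)


if __name__ == "__main__":
    # Assumption: a single-word keyword, so no URL encoding is needed.
    for heading in search_headings("geeksforgeeks"):
        print(heading)
```

Keeping the parsing in extract_headings, separate from the network call, makes it possible to try the h3 extraction on any saved HTML string without contacting Google.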
Example 2: The same approach can be used to extract a city's temperature from Google search results.
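A hedged sketch of the temperature lookup is shown here. The CSS class "BNeawe" is an assumption about where Google places the temperature in the page it returns to a plain requests call. Google changes its markup often, so inspect the fetched HTML and adjust the class if the function returns None. The helper names, the query wording, and the city "Delhi" are illustrative.

```python
import requests
from bs4 import BeautifulSoup

SEARCH_URL = "https://google.com/search"

# Assumption: the temperature sits in a <div> carrying this class.
# Verify against the HTML you actually receive; it may differ.
TEMPERATURE_CLASS = "BNeawe"


def extract_temperature(html, css_class=TEMPERATURE_CLASS):
    """Return the text of the first <div> with css_class, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find("div", class_=css_class)
    return node.get_text(strip=True) if node is not None else None


def city_temperature(city):
    """Search Google for the city's temperature and return the matched text."""
    # Passing the query via params lets requests URL-encode the spaces.
    response = requests.get(
        SEARCH_URL, params={"q": f"temperature in {city}"}, timeout=10
    )
    response.raise_for_status()
    return extract_temperature(response.text)


if __name__ == "__main__":
    print(city_temperature("Delhi"))
```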