How to get the next page on BeautifulSoup?

Last Updated : 16 May, 2021

In this article, we will see how to get the next page with BeautifulSoup.

Modules Needed

  • BeautifulSoup: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. To install it, run the command below in the terminal. The package is published on PyPI as beautifulsoup4 and is imported as bs4.
pip install beautifulsoup4
  • requests: This library lets you send HTTP/1.1 requests with very little code. To install it, run the command below in the terminal.
pip install requests

Approach:

Getting the next page with BeautifulSoup means scraping one page first and then, if that page links to other pages we also want, scraping those as well. We first scrape the sample website. For each link found on it, we call the requests.get method again for the linked page and create a soup from its content. Repeating this for every link takes us from one page to the next.

Let's go through the script step by step:

Step 1: Import the dependencies.

from bs4 import BeautifulSoup
import requests

Step 2: Request the page with requests. Here sample_website is a variable that holds the URL of the start page.

page = requests.get(sample_website)

Step 3: Create a soup of the page by passing the response content and the HTML parser to the BeautifulSoup constructor.

soup = BeautifulSoup(page.content, 'html.parser')
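Steps 2 and 3 can also be combined into one small helper. The following is a minimal sketch, not the article's own code: the name make_soup, the 10-second timeout, and the get parameter (which exists only so the helper can be tried without network access) are assumptions made for illustration.

```python
from bs4 import BeautifulSoup
import requests


def make_soup(url, get=requests.get, timeout=10):
    """Fetch `url` and return a BeautifulSoup tree (hypothetical helper)."""
    # A timeout keeps the script from hanging on an unresponsive server.
    response = get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    # response.content holds the raw bytes of the page body.
    return BeautifulSoup(response.content, "html.parser")
```

With this helper, make_soup(sample_website) would stand in for the two separate calls shown in Steps 2 and 3.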

Step 4: Search the parse tree and find the links. For each URL we want, we use the requests module and BeautifulSoup again to create a soup of the linked page. This is how we get to the next page.

for link in soup.find_all('a', href=True):

    # keep only the links whose URL contains
    # the string "www.geeksforgeeks.org"
    if "www.geeksforgeeks.org" in link['href']:

        # request the linked page
        nextpage = requests.get(link['href'])

        # create a soup for the linked page
        nextsoup = BeautifulSoup(nextpage.content, 'html.parser')

        # anything on the linked page can be scraped here;
        # this example prints the page title
        print("next url title : ", nextsoup.find('title').string)
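The substring check in this loop only matches links whose href already contains the host name, so relative links such as /python/ are skipped. One way to handle them is to resolve every href against the page URL with urllib.parse.urljoin and then compare host names. The sketch below shows that idea under assumptions: the helper name same_site_links, the inline HTML, and the example paths are all made up for illustration.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def same_site_links(html, page_url):
    """Return absolute URLs of links that stay on the host of `page_url`."""
    soup = BeautifulSoup(html, "html.parser")
    host = urlparse(page_url).netloc
    links = []
    for anchor in soup.find_all("a", href=True):
        # urljoin resolves a relative href against page_url and
        # leaves a URL that is already absolute unchanged.
        absolute = urljoin(page_url, anchor["href"])
        # Keep each same-host URL once, in page order.
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


# Made-up markup standing in for a downloaded page.
html = """
<a href="/python/">Python</a>
<a href="https://www.geeksforgeeks.org/java/">Java</a>
<a href="https://example.com/elsewhere">External</a>
<a>No href</a>
"""
for url in same_site_links(html, "https://www.geeksforgeeks.org/"):
    print(url)
```

Each URL returned this way can be passed to requests.get exactly as in the loop above.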


Below is the full implementation:

from bs4 import BeautifulSoup
import requests

# sample website
# NOTE: the URL is missing from this copy of the article; the value
# below is a placeholder assumption, so replace it with your start page
sample_website = 'https://www.geeksforgeeks.org/'

# request the page with the get method
page = requests.get(sample_website)

# create a soup from the page content
# with the built-in HTML parser
soup = BeautifulSoup(page.content, 'html.parser')

# search the parse tree for every <a> tag
# that has an href attribute
for link in soup.find_all('a', href=True):

    # keep only the links whose URL contains
    # the string "www.geeksforgeeks.org"
    if "www.geeksforgeeks.org" in link['href']:

        # request the linked page
        nextpage = requests.get(link['href'])

        # create a soup for the linked page
        nextsoup = BeautifulSoup(nextpage.content, 'html.parser')

        # anything on the linked page can be scraped here;
        # this example prints the page title
        print("next url title : ", nextsoup.find('title').string)


Output:

next url title :  GeeksforGeeks | A computer science portal for geeks
next url title :  Analysis of Algorithms | Set 1 (Asymptotic Analysis) - GeeksforGeeks
next url title :  Analysis of Algorithms | Set 2 (Worst, Average and Best Cases) - GeeksforGeeks
next url title :  Analysis of Algorithms | Set 3 (Asymptotic Notations) - GeeksforGeeks
next url title :  Analysis of algorithms | little o and little omega notations - GeeksforGeeks
next url title :  Lower and Upper Bound Theory - GeeksforGeeks
next url title :  Analysis of Algorithms | Set 4 (Analysis of Loops) - GeeksforGeeks
next url title :  Analysis of Algorithm | Set 4 (Solving Recurrences) - GeeksforGeeks
next url title :  Analysis of Algorithm | Set 5 (Amortized Analysis Introduction) - GeeksforGeeks
next url title :  What does 'Space Complexity' mean? - GeeksforGeeks
next url title :  Pseudo-polynomial Algorithms - GeeksforGeeks
next url title :  Polynomial Time Approximation Scheme - GeeksforGeeks
next url title :  A Time Complexity Question - GeeksforGeeks
................................................................. 
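
The script above visits every matching link on one page. Many paginated sites instead expose a single "next" link per page, and the same request-and-parse cycle can follow it until it runs out. The sketch below is one possible way to do that, not the method used in this article. It assumes the site marks that link with rel="next". The names find_next_url and crawl, the max_pages limit, and the in-memory PAGES dictionary (used in place of real HTTP requests so the example runs offline) are hypothetical.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_next_url(soup, current_url):
    """Return the absolute URL of the rel="next" link, or None if absent."""
    # Assumption: the site marks its pagination link with rel="next".
    link = soup.find("a", rel="next", href=True)
    return urljoin(current_url, link["href"]) if link else None


def crawl(start_url, fetch_html, max_pages=10):
    """Follow "next" links from start_url and collect each page title."""
    titles, seen, url = [], set(), start_url
    # Stop on a missing link, a repeated URL, or the page limit.
    while url and url not in seen and len(titles) < max_pages:
        seen.add(url)
        soup = BeautifulSoup(fetch_html(url), "html.parser")
        titles.append(soup.title.get_text(strip=True) if soup.title else None)
        url = find_next_url(soup, url)
    return titles


# Made-up pages standing in for a real site.
PAGES = {
    "https://example.com/page/1": '<title>Page 1</title><a rel="next" href="/page/2">Next</a>',
    "https://example.com/page/2": '<title>Page 2</title><a rel="next" href="/page/3">Next</a>',
    "https://example.com/page/3": "<title>Page 3</title>",
}
print(crawl("https://example.com/page/1", PAGES.__getitem__))
```

For a real site, fetch_html could be a function such as lambda url: requests.get(url, timeout=10).content. The seen set and the page limit guard against pagination that loops back on itself.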

