
How to Scrape Paragraphs using Python?

Prerequisite: Implementing Web Scraping in Python with BeautifulSoup

In this article, we will see how to extract all the paragraphs from a given HTML document or URL using Python.



Modules Needed:

pip install bs4
pip install requests

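Approach:

1. Import the module.
2. Create an HTML document, or fetch one from a URL.
3. Pass the HTML to BeautifulSoup() to build a parse tree.
4. Use find_all("p") to collect every <p> tag.
5. Call get_text() on each tag to get its text.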

Code:




# import module
from bs4 import BeautifulSoup
  
# Html doc
html_doc = """
<html>
<head>
<title>Geeks</title>
</head>
<body>
<h2>paragraphs</h2>
  
<p>Welcome geeks.</p>
  
  
<p>Hello geeks.</p>
  
</body>
</html>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
  
# traverse paragraphs from soup
for data in soup.find_all("p"):
    print(data.get_text())

Output:

Welcome geeks.
Hello geeks.
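
Printing each paragraph is fine for a quick check, but it is often more convenient to collect the text into a list. The following is a minimal sketch of that idea. The helper name get_paragraphs and the sample HTML are assumptions made for illustration and are not part of BeautifulSoup; the whitespace cleanup is one possible choice, not a requirement.

```python
from bs4 import BeautifulSoup


def get_paragraphs(html):
    """Return the text of each non-empty <p> tag in an HTML string.

    get_paragraphs is a hypothetical helper name, not a BeautifulSoup API.
    """
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = []
    for tag in soup.find_all("p"):
        # Collapse runs of spaces and newlines into single spaces
        text = " ".join(tag.get_text().split())
        if text:  # skip paragraphs that contain no text
            paragraphs.append(text)
    return paragraphs


# Illustrative input: uneven spacing plus one empty paragraph
sample = "<body><p>Welcome   geeks.</p><p> </p><p>Hello\ngeeks.</p></body>"
print(get_paragraphs(sample))
```

Returning a list instead of printing makes it easy to count the paragraphs, filter them, or write them to a file later.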

Now let's extract paragraphs from a given URL.

Code:




# import modules
import requests
from bs4 import BeautifulSoup

# fetch the HTML of the page at the given URL
def getdata(url):
    r = requests.get(url)
    return r.text

htmldata = getdata("https://www.geeksforgeeks.org/")
soup = BeautifulSoup(htmldata, 'html.parser')

# print the text of every paragraph on the page
for data in soup.find_all("p"):
    print(data.get_text())

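Output:

The text of every paragraph on the page is printed in document order. The exact output depends on the page's content at the time of the request.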
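
The request in the code above has no timeout and does not check the HTTP status, so a slow server can stall the script and an error page can be parsed as if it were real content. The following is a minimal sketch of a more defensive version. The helper names fetch_html and scrape_paragraphs and the 10-second timeout are assumptions chosen for illustration, not values from the original article.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML (hypothetical helper).

    The 10-second default timeout is an arbitrary, assumed value.
    """
    response = requests.get(url, timeout=timeout)
    # Raise requests.exceptions.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.text


def scrape_paragraphs(url):
    """Return the non-empty paragraph texts found at a URL (hypothetical helper)."""
    soup = BeautifulSoup(fetch_html(url), "html.parser")
    texts = (" ".join(p.get_text().split()) for p in soup.find_all("p"))
    return [text for text in texts if text]
```

Calling scrape_paragraphs("https://www.geeksforgeeks.org/") would return the page's paragraphs as a list. Connection failures, timeouts, and bad status codes all surface as subclasses of requests.exceptions.RequestException, so a caller can catch that one exception type.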

