
How to use XPath with BeautifulSoup?

Prerequisites: BeautifulSoup

In this article, we will see how to use XPath with BeautifulSoup. BeautifulSoup does not support XPath expressions on its own, so the parsed page is handed to lxml, where getting data from an element on the webpage is done with XPaths.



Modules needed and installation:

First, we need to install all these modules on our computer.

pip install bs4
pip install lxml
pip install requests




Using XPath

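XPath works very much like a traditional file system: an expression is a path of steps that leads from the root of the document down to the element you want.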

To access file 1,

C:/File1

Similarly, to access file 2,

C:/Documents/User1/File2
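
The same idea can be sketched with lxml on a small HTML string. The markup below is hypothetical and exists only for illustration; the point is that an XPath names each level of the document the way a file path names each folder.

```python
from lxml import etree

# Hypothetical HTML used only for illustration
html = """
<html>
  <body>
    <div>
      <h1>Title</h1>
      <p>First paragraph</p>
    </div>
  </body>
</html>
"""

dom = etree.HTML(html)

# Absolute path: walk down from the root one level at a time,
# much like C:/Documents/User1/File2
absolute = dom.xpath('/html/body/div/h1')

# '//' matches at any depth, like searching for a file
# without knowing which folder it is in
anywhere = dom.xpath('//p')

h1_text = absolute[0].text
p_text = anywhere[0].text
print(h1_text)
print(p_text)
```

xpath() returns a list of matching elements, which is why the first match is read with [0].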

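To find the XPath for a particular element on a page, right-click the element in the browser and choose Inspect, then right-click the highlighted node in the Elements tab and choose Copy, then Copy XPath.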

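Approach

Fetch the page with requests, parse it with BeautifulSoup, convert the soup to an lxml etree object (BeautifulSoup itself has no XPath support), and pass the copied expression to the .xpath() method of that object.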

Note: If the copied XPath does not give the desired result, copy the full XPath instead; the remaining steps are the same.
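
As a rough illustration of the difference between the two forms, the sketch below runs a short, attribute-based XPath and a full, root-to-element XPath against the same hypothetical HTML. The markup and the id value are assumptions made for the example, not taken from a real page.

```python
from lxml import etree

# Hypothetical HTML used only for illustration
html = """
<html>
  <body>
    <div><span>ignored</span></div>
    <div>
      <h1 id="firstHeading">Example heading</h1>
    </div>
  </body>
</html>
"""

dom = etree.HTML(html)
tree = dom.getroottree()

# Short form: relies on the element's id attribute
short_result = dom.xpath('//*[@id="firstHeading"]')

# Full form: spells out every step from the root of the document
full_result = dom.xpath('/html/body/div[2]/h1')

short_text = short_result[0].text
full_text = full_result[0].text

# getpath() reports the root-to-element path of a node,
# so equal paths mean both expressions found the same element
same_path = tree.getpath(short_result[0]) == tree.getpath(full_result[0])

print(short_text)
print(full_text)
print(same_path)
```

The short form keeps working if the surrounding layout changes but depends on the attribute being present; the full form needs no attribute but breaks as soon as the nesting changes.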

Given below is an example that shows how XPath can be used with BeautifulSoup.

Program:




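from bs4 import BeautifulSoup
from lxml import etree
import requests

# Assumed target page, inferred from the output below;
# substitute the page you want to scrape
URL = "https://en.wikipedia.org/wiki/Nike,_Inc."

# Browser-like headers sent with the request
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36'),
    'Accept-Language': 'en-US, en;q=0.5',
}

webpage = requests.get(URL, headers=HEADERS)
soup = BeautifulSoup(webpage.content, "html.parser")

# BeautifulSoup has no XPath support, so convert the soup to an lxml tree
dom = etree.HTML(str(soup))

# itertext() also collects text nested inside child elements of the heading
heading = dom.xpath('//*[@id="firstHeading"]')[0]
print(''.join(heading.itertext()).strip())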

Output:

Nike, Inc.
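
The program above reads the first match with [0], which raises an IndexError when the XPath matches nothing. A minimal sketch of a safer lookup follows; the helper name first_text and the sample markup are hypothetical, and the helper assumes the expression selects elements rather than attributes or text nodes. It runs on an inline string, so no network request is needed to try it.

```python
from bs4 import BeautifulSoup
from lxml import etree


def first_text(html, xpath_expr):
    """Return the stripped text of the first matching element, or None."""
    soup = BeautifulSoup(html, "html.parser")
    dom = etree.HTML(str(soup))
    matches = dom.xpath(xpath_expr)
    if not matches:
        # Nothing matched: report that instead of failing on matches[0]
        return None
    # itertext() also collects text held in nested child elements
    return ''.join(matches[0].itertext()).strip()


# Hypothetical markup used only for illustration
sample = ('<html><body>'
          '<h1 id="firstHeading"><span>Nike, Inc.</span></h1>'
          '</body></html>')

found = first_text(sample, '//*[@id="firstHeading"]')
missing = first_text(sample, '//*[@id="noSuchId"]')
print(found)
print(missing)
```

Returning None for a missing element lets the calling code decide whether to retry with the full XPath or skip the page.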