
How to Handle Duplicate Attributes in BeautifulSoup?

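When scraping a page, you may find that several tags of the same type share the same attribute value, such as a repeated id. This article shows how to collect all of those tags with BeautifulSoup, remove their attributes from the output, and print only the items you need.

To collect every tag that shares the same attribute value, call find_all with the tag name and the attribute, as given below. It returns all the matches as a list.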



Syntax: 

items=soup.find_all("#Widget Name", {"id":"#Id name of widget in which you want to edit"})



After writing the above code, remove the attributes from each result and print the particular item you want from the list.
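The two steps above can be sketched on a small inline HTML string, so the idea can be tried without any files on disk. This is a minimal sketch, not the article's program: SAMPLE_HTML and the helper name find_and_strip are hypothetical names chosen for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup with the same id repeated on several <p> tags
SAMPLE_HTML = """
<div>
  <p id="vinayak">King</p>
  <p id="vinayak">Prince</p>
  <p id="vinayak">Queen</p>
</div>
"""


def find_and_strip(html, tag_name, tag_id):
    """Return every matching tag with its attributes removed."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns all matches, not just the first one
    matches = soup.find_all(tag_name, {"id": tag_id})
    for match in matches:
        # Replacing attrs with an empty dict drops id and any other attribute
        match.attrs = {}
    return matches


items = find_and_strip(SAMPLE_HTML, "p", "vinayak")

# Pick the item you want by its position in the list
print(items[1])  # prints <p>Prince</p>
```

Because the result behaves like a list, any of the duplicate tags can be picked by index once the attributes have been cleared.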

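Approach:

Import the libraries BeautifulSoup and os.

Get the folder of the Python file by removing the last segment of its path.

Syntax:

base=os.path.dirname(os.path.abspath('#Name of Python file in which you are currently working'))

Open the HTML file from which you want to read values.

Syntax:

html=open(os.path.join(base, '#Name of HTML file from which you wish to read value'))

Parse the HTML file with Beautiful Soup, then find all the tags that share the same id.

Syntax:

items=soup.find_all("#Widget Name", {"id":"#Id name of widget in which you want to edit"})

Remove the attributes from each result and print the items you want.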

Webpage in use:




<!DOCTYPE html>
<html>
 <head>
   <title>Geeks For Geeks</title>
 </head>
 <body>
 <div>
     <p id="vinayak">King</p>
  
     <p id="vinayak">Prince</p>
  
     <p id="vinayak">Queen</p>
  
 </div>
 <p id="vinayak">Princess</p>
  
  </body>
</html>

Program:




# Import the libraries beautifulsoup and os
from bs4 import BeautifulSoup as bs
import os

# Remove the last segment of the path to get the
# folder of the script. Replace gfg4.py with the
# name of your Python file. The path is resolved
# against the current working directory, so run
# the script from its own folder
base = os.path.dirname(os.path.abspath("gfg4.py"))

# Open the HTML file you want to read and parse
# it with Beautiful Soup. The with block closes
# the file once parsing is done
with open(os.path.join(base, 'gfg.html')) as html:
    soup = bs(html, 'html.parser')

# Finding all the paragraphs inside the first div
# that have the id: vinayak
items = soup.div.find_all("p", {"id": "vinayak"})

# Removing attributes from the output
for item in items:
    item.attrs = {}

# Printing the value Prince
print(items[1])

# Printing the value Queen
print(items[2])

Output:

<p>Prince</p>

<p>Queen</p>
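The output contains only items found inside the div because the program searches from soup.div. As a hedged illustration of how the search scope changes the result, the sketch below compares find, a find_all scoped to the div, and a find_all over the whole document. The PAGE string is a hypothetical inline copy of the webpage above, and get_text is used here to print only the text instead of clearing the attributes.

```python
from bs4 import BeautifulSoup

# Hypothetical inline copy of the webpage used in this article
PAGE = """<html><body>
<div>
  <p id="vinayak">King</p>
  <p id="vinayak">Prince</p>
  <p id="vinayak">Queen</p>
</div>
<p id="vinayak">Princess</p>
</body></html>"""

soup = BeautifulSoup(PAGE, "html.parser")

# find stops at the first match, so the duplicates are ignored
first_only = soup.find("p", {"id": "vinayak"})

# Searching from soup.div limits the matches to the first <div>
in_div = soup.div.find_all("p", {"id": "vinayak"})

# Searching from soup covers the whole document
whole_doc = soup.find_all("p", {"id": "vinayak"})

texts_in_div = [p.get_text() for p in in_div]
texts_whole = [p.get_text() for p in whole_doc]

print(first_only.get_text())  # prints King
print(texts_in_div)           # prints ['King', 'Prince', 'Queen']
print(texts_whole)            # prints ['King', 'Prince', 'Queen', 'Princess']
```

Starting the search at the narrowest element that holds the tags you need keeps unrelated duplicates, such as Princess here, out of the list.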

