Python BeautifulSoup Navigating tree sideways
Last Updated: 01 Nov, 2022
In this article, we will see how to navigate the BeautifulSoup parse tree sideways. Navigating sideways means moving between tags that sit at the same level of the tree. The example below illustrates the idea.
<a>
<b></b>
<c></c>
</a>
In the above example, the tags <b> and <c> are at the same level.
Installation of Required Modules:
bs4: BeautifulSoup is not part of the Python standard library, so it has to be installed manually. Run the command below:
pip install bs4
lxml: lxml is a mature Python binding for the libxml2 and libxslt libraries. It exposes them through the ElementTree API, giving safe and convenient access to both.
pip install lxml
Let's understand with an implementation:
prettify(): The prettify() method in BeautifulSoup shows how the tags in a document are nested.
Syntax: (BeautifulSoup variable).prettify()
Example :
Python3
import bs4

# <b> and <c> are both direct children of <a>
sibling_soup = bs4.BeautifulSoup(
    "<a><b>Welcome to Geekforgeeks</b><c>Hello geeks</c></a>", "html.parser"
)
print(sibling_soup.prettify())
Output:
<a>
 <b>
  Welcome to Geekforgeeks
 </b>
 <c>
  Hello geeks
 </c>
</a>
Navigating sideways
We can navigate sideways in a document using the .next_sibling and .previous_sibling attributes of BeautifulSoup. These two attributes let us move between tags that are at the same level of the tree.
Let us get a better insight into the concept through a worked example:
Consider a sample document :
Python3
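import bs4

# <b> and <c> are both direct children of <a>
sibling_of_soup = bs4.BeautifulSoup(
    "<a><b>CPPSecrets</b><c><strong>"
    "C++ Python Professional HandBook Guide</strong></c></a>",
    "lxml",
)
print(sibling_of_soup.prettify())
Output:
<html>
 <body>
  <a>
   <b>
    CPPSecrets
   </b>
   <c>
    <strong>
     C++ Python Professional HandBook Guide
    </strong>
   </c>
  </a>
 </body>
</html>
In the above output, we can clearly see that the <b> and <c> tags are on the same level and are both children of the same <a> tag, so we can classify them as siblings. The <html> and <body> wrappers are added by the lxml parser.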
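The sibling relationship can also be checked in code rather than read off the printed tree. The following is a minimal sketch, assuming the same sample document parsed with 'html.parser' (so no <html> or <body> wrappers are added); the variable names are illustrative only.

```python
import bs4

soup = bs4.BeautifulSoup(
    "<a><b>CPPSecrets</b><c><strong>"
    "C++ Python Professional HandBook Guide</strong></c></a>",
    "html.parser",
)

# Two tags are siblings when they share the same direct parent.
b_parent = soup.b.parent
c_parent = soup.c.parent
same_parent = b_parent is c_parent

print(b_parent.name, c_parent.name, same_parent)  # a a True
```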
Now we can navigate between the sibling tags <b> and <c> by using:
- .next_sibling
- .previous_sibling
1. Navigating using .next_sibling :
Python3
import bs4

sibling_of_soup = bs4.BeautifulSoup(
    "<a><b>CPPSecrets</b><c><strong>"
    "C++ Python Professional HandBook Guide</strong></c></a>",
    "lxml",
)
# The next sibling of <b> is the <c> tag
print(sibling_of_soup.b.next_sibling)
Output:
<c><strong>C++ Python Professional HandBook Guide</strong></c>
The code above prints the <c> tag together with its contents, because <c> is the next sibling of <b>.
If we instead ask for the next sibling of the <c> tag:
Python3
import bs4

sibling_of_soup = bs4.BeautifulSoup(
    "<a><b>CPPSecrets</b><c><strong>"
    "C++ Python Professional HandBook Guide</strong></c></a>",
    "lxml",
)
# <c> is the last child of <a>, so it has no next sibling
print(sibling_of_soup.c.next_sibling)
Output:
None
In the above code, the output is "None" because no tag follows <c> inside <a>.
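When a tag has several siblings, reading .next_sibling one step at a time becomes tedious. BeautifulSoup also offers the .next_siblings and .previous_siblings generators, which yield all remaining siblings in one direction. A short sketch, assuming a hypothetical document that adds a third tag <d> to the same structure:

```python
import bs4

# Hypothetical document: same shape as the examples above, plus an extra <d> sibling.
soup = bs4.BeautifulSoup(
    "<a><b>one</b><c>two</c><d>three</d></a>", "html.parser"
)

# .next_siblings yields every following sibling in document order.
following = [tag.name for tag in soup.b.next_siblings]
print(following)  # ['c', 'd']

# .previous_siblings walks backwards, nearest sibling first.
preceding = [tag.name for tag in soup.d.previous_siblings]
print(preceding)  # ['c', 'b']
```

The generators stop on their own at the end of the sibling list, so there is no need to check for None by hand.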
2. Navigating Using .previous_sibling:
Python3
import bs4

sibling_of_soup = bs4.BeautifulSoup(
    "<a><b>CPPSecrets</b><c><strong>"
    "C++ Python Professional HandBook Guide</strong></c></a>",
    "lxml",
)
# <b> comes right before <c>; nothing comes before <b>
print(sibling_of_soup.c.previous_sibling)
print(sibling_of_soup.b.previous_sibling)
Output:
<b>CPPSecrets</b>
None
Here, .previous_sibling on the <c> tag returns the <b> tag, because <b> is the sibling immediately before it. Applying .previous_sibling to the <b> tag gives "None", as no sibling precedes <b>.
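One caveat: the examples above contain no whitespace between tags. In real documents, line breaks between tags are parsed as text nodes, so .next_sibling and .previous_sibling often return a newline string instead of a tag. The sketch below uses a hypothetical document with line breaks and shows the find_next_sibling() and find_previous_sibling() methods, which search the siblings for a matching tag.

```python
import bs4

# Hypothetical document with line breaks between the tags.
html = "<a>\n<b>CPPSecrets</b>\n<c>Hello</c>\n</a>"
soup = bs4.BeautifulSoup(html, "html.parser")

# The immediate sibling of <b> is the newline text node, not <c>.
raw_sibling = soup.b.next_sibling
print(repr(raw_sibling))

# find_next_sibling("c") searches the following siblings for a <c> tag,
# so the text node in between is skipped.
next_tag = soup.b.find_next_sibling("c")
print(next_tag)

# find_previous_sibling("b") does the same in the other direction.
prev_tag = soup.c.find_previous_sibling("b")
print(prev_tag)
```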