MS Excel is a powerful tool for handling huge amounts of tabular data. It can be particularly useful for sorting, analyzing, performing complex calculations and visualizing data. In this article, we will discuss how to extract a table from a webpage and store it in Excel format.
Step #1: Converting to Pandas dataframe
Pandas is a Python library for working with tabular data. Our first step is to load the table from the webpage into a Pandas dataframe. The function read_html() returns a list of dataframes, one for each table found in the webpage. Here we assume that the webpage contains a single table, so the dataframe we want is the element at index 0 of that list.
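The step above can be sketched as follows. This is a minimal example under stated assumptions: the HTML string is a hypothetical stand-in for the webpage so that the snippet runs offline, and for a live page you would pass its URL to read_html() instead. read_html() also needs an HTML parser such as lxml to be installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the webpage: one table whose first row holds
# the column names as ordinary cells (no <th> elements).
html = """
<table>
  <tr><td>ROLL_NO</td><td>NAME</td><td>ADDRESS</td><td>PHONE</td><td>AGE</td></tr>
  <tr><td>1</td><td>RAM</td><td>DELHI</td><td>9455123451</td><td>18</td></tr>
  <tr><td>2</td><td>RAMESH</td><td>GURGAON</td><td>9652431543</td><td>18</td></tr>
  <tr><td>3</td><td>SUJIT</td><td>ROHTAK</td><td>9156253131</td><td>20</td></tr>
  <tr><td>4</td><td>SURESH</td><td>DELHI</td><td>9156768971</td><td>18</td></tr>
</table>
"""

# read_html() returns a list with one dataframe per <table> it finds.
tables = pd.read_html(io.StringIO(html))

# The page is assumed to contain a single table, so take index 0.
df = tables[0]
print(df)
```

Because the first row uses plain cells rather than header cells, Pandas labels the columns with integers and keeps the names as row 0 of the dataframe.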
Output:

         0       1        2           3    4
0  ROLL_NO    NAME  ADDRESS       PHONE  AGE
1        1     RAM    DELHI  9455123451   18
2        2  RAMESH  GURGAON  9652431543   18
3        3   SUJIT   ROHTAK  9156253131   20
4        4  SURESH    DELHI  9156768971   18
Step #2: Storing the Pandas dataframe in an Excel file
For this, we call the to_excel() method of the dataframe, passing the output filename as a parameter.
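A minimal sketch of this step is shown below. The small dataframe, the filename students.xlsx and the temporary output folder are all assumptions made for illustration, and writing .xlsx files requires an Excel engine such as openpyxl to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical dataframe standing in for the table extracted in Step #1.
df = pd.DataFrame(
    {
        "ROLL_NO": [1, 2],
        "NAME": ["RAM", "RAMESH"],
        "ADDRESS": ["DELHI", "GURGAON"],
        "PHONE": [9455123451, 9652431543],
        "AGE": [18, 18],
    }
)

# Assumed output location: a file inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "students.xlsx")

# to_excel() writes the dataframe to the given file.
# index=False leaves the row labels (0, 1, ...) out of the sheet.
df.to_excel(out_path, index=False)

print("Saved to", out_path)
```

Passing only the filename, as in df.to_excel(out_path), also works; the row index is then written as an extra first column.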
If the webpage contains multiple tables, change the index from 0 to that of the required table in the list returned by read_html().
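As a rough illustration of picking a table by index, the sketch below parses a hypothetical page containing two tables and selects the second one. The HTML string and its contents are made up for the example, and an HTML parser such as lxml is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical page with two tables.
html = """
<table>
  <tr><td>A</td><td>B</td></tr>
  <tr><td>1</td><td>2</td></tr>
</table>
<table>
  <tr><td>X</td><td>Y</td></tr>
  <tr><td>3</td><td>4</td></tr>
</table>
"""

tables = pd.read_html(io.StringIO(html))
print("Tables found:", len(tables))

# Tables appear in the list in page order, so index 1 is the second table.
second = tables[1]
print(second)
```

Printing len(tables) first is a quick way to check how many tables were found before choosing an index.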