Scraping Weather Prediction Data using Python and BS4
This article revolves around scraping weather prediction data using Python and the bs4 library. Let’s check out the components used in the script –
BeautifulSoup – A Python library for pulling data out of HTML/XML files. It builds a parse tree from the page that can be navigated and searched to extract data.
Requests – A Python HTTP library that makes HTTP requests simpler. We just pass the URL as an argument to get(), which returns a response object holding the page content, status code, and headers.
Step 1 – Fetch the content of the URL into a response object:
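A minimal sketch of this step, assuming the requests library is installed. The URL and the fetch_page helper are hypothetical placeholders, not names from the original script; substitute the forecast page you actually want to scrape.

```python
import requests

# Hypothetical forecast page; replace with the real weather site URL.
URL = "https://example.com/weather/forecast"


def fetch_page(url, timeout=10):
    """Download a page and return the requests.Response object."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of silently parsing an error page.
    response.raise_for_status()
    return response
```

Calling fetch_page(URL) returns the response object; its content attribute holds the raw HTML that the next step parses.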
Step 2 – Parse HTML content:
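As an illustration, the parsing step might look like the sketch below. The sample_html string is an invented stand-in for the response body obtained in Step 1, so the example runs without network access; "html.parser" is the parser bundled with Python.

```python
from bs4 import BeautifulSoup

# Invented stand-in for the HTML downloaded in Step 1.
sample_html = """
<html>
  <head><title>7-Day Forecast</title></head>
  <body><h1>Weather for Sample City</h1></body>
</html>
"""

# Build the parse tree using Python's built-in HTML parser.
soup = BeautifulSoup(sample_html, "html.parser")

print(soup.title.get_text())       # text inside <title>
print(soup.find("h1").get_text())  # text inside the first <h1>
```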
Step 3 – Scrape the forecast data from the weather site, using the following calls:
find_all: Returns a list of all descendant elements that match the tag (and any attribute filters) passed in as arguments.
find: Returns only the first element that matches the tag passed, or None if nothing matches.
list.append(dict): Appends each dictionary of extracted values to the end of the list.
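The three calls above can be combined as in the following sketch. The HTML snippet, the id "forecast", and the class names "day", "period", "desc", and "temp" are all hypothetical; a real weather page will use its own markup, which you would inspect in the browser first.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a real forecast page.
sample_html = """
<div id="forecast">
  <div class="day">
    <p class="period">Today</p>
    <p class="desc">Sunny</p>
    <p class="temp">High: 65 F</p>
  </div>
  <div class="day">
    <p class="period">Tonight</p>
    <p class="desc">Clear</p>
    <p class="temp">Low: 48 F</p>
  </div>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find: the first (here, only) element with id="forecast".
container = soup.find("div", id="forecast")

forecast = []
# find_all: every <div class="day"> inside the container.
for item in container.find_all("div", class_="day"):
    entry = {
        "period": item.find("p", class_="period").get_text(strip=True),
        "desc": item.find("p", class_="desc").get_text(strip=True),
        "temp": item.find("p", class_="temp").get_text(strip=True),
    }
    # list.append(dict): one dictionary per forecast period.
    forecast.append(entry)

print(forecast)
```

Because find returns None when nothing matches, a real script should check each result before calling get_text on it.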
Step 4 – Convert the list into a CSV file to view the organized weather forecast data.
The list is first loaded into a pandas DataFrame, which is then written out as a CSV file.
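A rough sketch of the conversion, assuming pandas is installed. The forecast list is sample data in the shape produced by Step 3, and the output filename weather_forecast.csv is a hypothetical choice, since the original does not name one.

```python
import pandas as pd

# Sample data shaped like the list of dictionaries built in Step 3.
forecast = [
    {"period": "Today", "desc": "Sunny", "temp": "High: 65 F"},
    {"period": "Tonight", "desc": "Clear", "temp": "Low: 48 F"},
]

# Each dictionary becomes a row; the keys become column labels.
df = pd.DataFrame(forecast)

# Hypothetical output path; index=False leaves out the 0, 1, 2, ... row labels.
CSV_PATH = "weather_forecast.csv"
df.to_csv(CSV_PATH, index=False)

# With no path argument, to_csv returns the CSV text, which is handy for a preview.
csv_text = df.to_csv(index=False)
print(csv_text)
```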
Syntax: pandas.DataFrame(data=None, index: Optional[Collection] = None, columns: Optional[Collection] = None, dtype: Union[str, numpy.dtype, ExtensionDtype, None] = None, copy: bool = False)
data: The input data, given as an ndarray, iterable, dict, or DataFrame. A dict can contain Series, arrays, constants, or list-like objects.
index: Index (row labels) to use for the resulting frame. Defaults to RangeIndex if the input data carries no indexing information and no index is provided.
columns: Column labels to use for the resulting frame. Defaults to RangeIndex (0, 1, 2, …, n) if no column labels are provided.
dtype: Data type to force on the frame. Only a single dtype is allowed; if None, the type is inferred from the data.
copy: Whether to copy the data from the input. The default in the signature above is False.
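To make the parameters concrete, here is a small sketch with invented temperature values. It shows index supplying row labels, dtype forcing a single type, and columns selecting and ordering the keys of a dict input.

```python
import pandas as pd

# Invented sample values, for illustration only.
data = {"high": [65, 70], "low": [48, 50]}

# index sets the row labels; dtype forces every column to float64.
temps = pd.DataFrame(data=data, index=["Mon", "Tue"], dtype="float64")
print(temps)

# For dict input, columns selects the keys to keep and fixes their order.
reordered = pd.DataFrame(data=data, columns=["low", "high"])
print(reordered)
```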