Scraping Weather Prediction Data using Python and BS4

This article revolves around scraping weather prediction data using Python and the bs4 library. Let’s check out the components used in the script –

BeautifulSoup – It is a powerful Python library for pulling data out of HTML/XML files. It creates a parse tree for parsed pages that can be used to extract data from HTML/XML files.
Requests – It is a Python HTTP library. It makes HTTP requests simpler. We just need to pass the URL as an argument, and get() fetches all the information from it.

We will be scraping data from https://weather.com/en-IN/weather/tenday/l/INKA0344:1:IN.

Step 1 – Run the following code to fetch the page content from the URL into the response object (file):

import requests

# send an HTTP GET request to the website and store the response
file = requests.get("https://weather.com/en-IN/weather/tenday/l/INKA0344:1:IN")

Step 2 – Parse HTML content:



# import BeautifulSoup for parsing the fetched HTML
from bs4 import BeautifulSoup
soup = BeautifulSoup(file.content, "html.parser")
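
The find()/find_all() calls used below can be tried offline first. The snippet here is invented for illustration (it only reuses the article's class names, not the weather page's real markup):

```python
from bs4 import BeautifulSoup

# invented snippet reusing the article's class names (not the page's real markup)
html = """
<div class="locations-title ten-day-page-title"><h1>Bengaluru, India</h1></div>
<span class="date-time">Mon</span>
<span class="date-time">Tue</span>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching element; chain .find() to walk into it
title = soup.find("div", {"class": "locations-title ten-day-page-title"}).find("h1").text

# find_all() returns every matching element; collect each element's .text
days = [s.text for s in soup.find_all("span", {"class": "date-time"})]

print(title)  # Bengaluru, India
print(days)   # ['Mon', 'Tue']
```

Passing the full class string (e.g. "locations-title ten-day-page-title") matches tags whose class attribute is exactly that string.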

Step 3 – Scrape the data from the weather site by running the following code:


# create an empty list to hold one dictionary per forecast row
list = []

# page title, i.e. the location name
all = soup.find("div", {"class": "locations-title ten-day-page-title"}).find("h1").text

# find all tables with class "twc-table"
content = soup.find_all("table", {"class": "twc-table"})
for items in content:
    for i in range(len(items.find_all("tr")) - 1):
        # create an empty dictionary for this row
        dict = {}
        try:
            # assign a value to each key
            dict["day"] = items.find_all("span", {"class": "date-time"})[i].text
            dict["date"] = items.find_all("span", {"class": "day-detail"})[i].text
            dict["desc"] = items.find_all("td", {"class": "description"})[i].text
            dict["temp"] = items.find_all("td", {"class": "temp"})[i].text
            dict["precip"] = items.find_all("td", {"class": "precip"})[i].text
            dict["wind"] = items.find_all("td", {"class": "wind"})[i].text
            dict["humidity"] = items.find_all("td", {"class": "humidity"})[i].text
        except IndexError:
            # assign "None" values if no item exists with the specified class
            dict["day"] = "None"
            dict["date"] = "None"
            dict["desc"] = "None"
            dict["temp"] = "None"
            dict["precip"] = "None"
            dict["wind"] = "None"
            dict["humidity"] = "None"

        # append the row dictionary to the list
        list.append(dict)

find_all: It picks up all the HTML elements matching the tag (and attributes) passed as arguments, searching through the element's descendants.
find: It returns the first element matching the tag passed.
list.append(dict): This appends each row dictionary to the list.
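
The row-extraction loop can be exercised offline on a small, made-up table (the class names mirror the weather page's, but the HTML itself is invented for illustration):

```python
from bs4 import BeautifulSoup

# invented sample rows reusing the article's class names
sample = """
<table class="twc-table">
  <tr><span class="date-time">Mon</span><span class="day-detail">JUN 1</span><td class="temp">34/25</td></tr>
  <tr><span class="date-time">Tue</span><span class="day-detail">JUN 2</span><td class="temp">33/24</td></tr>
</table>
"""
soup = BeautifulSoup(sample, "html.parser")

rows = []
table = soup.find("table", {"class": "twc-table"})
for i in range(len(table.find_all("tr"))):
    row = {}
    try:
        row["day"] = table.find_all("span", {"class": "date-time"})[i].text
        row["date"] = table.find_all("span", {"class": "day-detail"})[i].text
        row["temp"] = table.find_all("td", {"class": "temp"})[i].text
    except IndexError:
        # fall back to "None" when a row lacks one of the expected cells
        row = {"day": "None", "date": "None", "temp": "None"}
    rows.append(row)

print(rows)
# [{'day': 'Mon', 'date': 'JUN 1', 'temp': '34/25'},
#  {'day': 'Tue', 'date': 'JUN 2', 'temp': '33/24'}]
```

This produces the same list-of-dictionaries shape that Step 4 converts to CSV.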

 

Step 4 – Convert the list into a CSV file to view the organized weather forecast data.

Use the following code to convert the list into a CSV file and store it in output.csv:


import pandas as pd
convert = pd.DataFrame(list)
convert.to_csv("output.csv")


Syntax: pandas.DataFrame(data=None, index: Optional[Collection] = None, columns: Optional[Collection] = None, dtype: Union[str, numpy.dtype, ExtensionDtype, None] = None, copy: bool = False)

Parameters:

data: Dict, which can contain Series, arrays, constants, or list-like objects.
index: Index to use for the resulting frame. Defaults to a RangeIndex if the input data carries no indexing information and no index is provided.
columns: Column labels to use for the resulting frame. Defaults to a RangeIndex (0, 1, 2, …, n) if no column labels are provided.
dtype: Data type to force for all columns. By default, the dtype is inferred from the data.
copy: Whether to copy the data from the input. The default value is False.
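
As a small illustration of the data and columns parameters (the forecast rows here are made up, not the scraped output):

```python
import pandas as pd

# made-up forecast rows in the same shape as the scraped dictionaries
rows = [
    {"day": "Mon", "date": "JUN 1", "temp": "34/25"},
    {"day": "Tue", "date": "JUN 2", "temp": "33/24"},
]

# columns= selects and orders the columns; the index defaults to a RangeIndex
df = pd.DataFrame(rows, columns=["date", "day", "temp"])

print(df.shape)          # (2, 3)
print(list(df.columns))  # ['date', 'day', 'temp']
```

Each dictionary becomes one row, with keys matched against the column labels.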


# read csv file using pandas
a = pd.read_csv("output.csv")
print(a)

Output: the printed DataFrame of the scraped ten-day forecast data.