
Web Scraping for Stock Prices in Python

Last Updated : 23 Jan, 2023

Web scraping is a technique for automatically extracting data from websites. It is often used for data mining and for gathering insights from large sites, and it is just as useful for personal projects. Python has a popular third-party library called Beautiful Soup that makes web scraping straightforward. In this article, we will extract current stock prices using web scraping and save them in an Excel file using Python.

Required Modules

In this article, we’ll look at how to work with the Requests, Beautiful Soup, and Pandas Python packages to consume data from websites.

  • The Requests module lets your Python program send HTTP requests to web servers and read their responses.
  • The Beautiful Soup module parses HTML and makes it easy to search. Using these two libraries, we’ll walk through how to fetch a web page and work with the textual information available on it.
  • The Pandas module provides high-performance data manipulation in Python. It is used for data analysis that involves a lot of processing, such as restructuring, cleaning, or merging.

Approach

  • First, we import the required libraries.
  • Then we take each URL stored in our list.
  • We fetch the page at that URL and feed its HTML to a soup object, which extracts the relevant information based on the class names we provide.
  • Finally, we store all the data in a Pandas DataFrame and save it to an Excel file.

Step 1: Import Libraries

We import the modules for Pandas, Requests, and Beautiful Soup, and declare a headers dictionary containing a user agent. Sending a browser-like user agent makes it less likely that the target website classifies our traffic as automated and blocks it. Many user agent strings are listed at https://developers.whatismybrowser.com/

Python3




import requests
from bs4 import BeautifulSoup
import pandas as pd
  
# Adjacent string literals are joined, so no stray whitespace ends up in the header
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/84.0.4147.105 Safari/537.36'}
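
One way to avoid passing the headers on every call is to attach them to a requests.Session. The sketch below assumes that approach; the helper name fetch_html and the 10-second timeout are illustrative choices, not part of the original script.

```python
import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/84.0.4147.105 Safari/537.36'}

# A Session sends the same headers with every request made through it.
session = requests.Session()
session.headers.update(headers)


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the page HTML or raise on an HTTP error."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # turns 4xx/5xx responses into exceptions
    return response.text
```

Nothing is requested until fetch_html() is called, so the block can be run safely before the URLs are defined.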


Step 2: Collect URLs to Scrape

We’ll store the URLs of the stock pages we need from www.investing.com in a list named urls:

urls = [
    'https://www.investing.com/equities/nike',
    'https://www.investing.com/equities/coca-cola-co',
    'https://www.investing.com/equities/microsoft-corp',
    'https://www.investing.com/equities/3m-co',
    'https://www.investing.com/equities/american-express',
    'https://www.investing.com/equities/amgen-inc',
    'https://www.investing.com/equities/apple-computer-inc',
    'https://www.investing.com/equities/boeing-co',
    'https://www.investing.com/equities/cisco-sys-inc',
    'https://www.investing.com/equities/goldman-sachs-group',
    'https://www.investing.com/equities/ibm',
    'https://www.investing.com/equities/intel-corp',
    'https://www.investing.com/equities/jp-morgan-chase',
    'https://www.investing.com/equities/mcdonalds',
    'https://www.investing.com/equities/salesforce-com',
    'https://www.investing.com/equities/verizon-communications',
    'https://www.investing.com/equities/visa-inc',
    'https://www.investing.com/equities/wal-mart-stores',
    'https://www.investing.com/equities/disney',
    ]
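
Since every URL shares the same prefix, the list can also be built from the company slugs. This is only a convenience sketch: BASE_URL and slugs are names introduced here for illustration, and only three of the slugs from the list above are shown.

```python
BASE_URL = 'https://www.investing.com/equities/'

# Illustrative subset of the slugs that appear in the list above.
slugs = ['nike', 'coca-cola-co', 'microsoft-corp']

# Join the shared prefix with each slug to get the full page URL.
urls = [BASE_URL + slug for slug in slugs]

for url in urls:
    print(url)
```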

Step 3: Retrieving Element Class Names

We can identify an element by looking at the rendered web page, but a script cannot. To locate the target element, we find the value of its class attribute and pass it to the script. Getting it is pretty easy. Let’s say you want the element that holds the stock name. Open the URL, inspect the element in the browser’s developer tools, and copy the text next to class=


Let’s iterate through the list of stocks we need and use soup.find() to find the tag with the specified class, then print the company, current stock price, percentage change, and volume for each stock.

company = soup.find('h1', {'class': 'text-2xl font-semibold instrument-header_title__gCaMF mobile:mb-2'}).text

price = soup.find('div', {'class': 'instrument-price_instrument-price__xfgbB flex items-end flex-wrap font-bold'}).find_all('span')[0].text

change = soup.find('div', {'class': 'instrument-price_instrument-price__xfgbB flex items-end flex-wrap font-bold'}).find_all('span')[2].text

volume = soup.find('div', {'class': 'trading-hours_value__5_NnB'}).text

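As we can see, the price and the change sit inside the same element and therefore share the same class. So we find all the span tags inside that element with find_all('span') and pick the one we need by its position, find_all('span')[tag number], before extracting its text.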
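
To see how find() and find_all() behave without touching the live site, here is a small sketch run against a hand-written HTML snippet. The markup, the values, and the short class names (title, price-box bold, volume) are made up for illustration; the real page uses the longer class names shown above.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for a stock page.
html = """
<h1 class="title">Nike Inc (NKE)</h1>
<div class="price-box bold">
  <span>127.43</span><span>+1.20</span><span>(+0.95%)</span>
</div>
<div class="volume">5,274,810</div>
"""

soup = BeautifulSoup(html, 'html.parser')

company = soup.find('h1', {'class': 'title'}).text
# Price and change sit in the same div, so pick the span by position.
spans = soup.find('div', {'class': 'price-box bold'}).find_all('span')
price = spans[0].text
change = spans[2].text
volume = soup.find('div', {'class': 'volume'}).text

print(company, price, change, volume)
```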


Step 4: Try Data Extraction

While extracting data from a web page, we can expect an AttributeError. When the tag we ask Beautiful Soup for is not present on the page, soup.find() returns None, and accessing .text on None raises an AttributeError. To handle this error we use try and except. You can also run the code in Google Colab, which has these libraries preinstalled.

How does try work?

  • First, the try clause is executed, i.e. the code between the try and except keywords.
  • If there is no exception, only the try clause runs and the except clause is skipped.
  • If an exception occurs, the rest of the try clause is skipped and the matching except clause runs.
  • If an exception occurs but no except clause in the code handles it, it is passed on to the outer try statements. If the exception is left unhandled, execution stops.
  • A try statement can have more than one except clause.
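
The rules above can be seen in a few lines. This sketch uses None to stand in for the result of a failed soup.find() call; the function name text_or_default and the FakeTag class are hypothetical. It also uses the optional else clause, which runs only when the try clause raised nothing.

```python
def text_or_default(tag, default="N/A"):
    """Return tag.text, or a default when the tag is missing."""
    try:
        value = tag.text          # raises AttributeError if tag is None
    except AttributeError:
        return default            # runs only when the exception occurs
    else:
        return value.strip()      # runs only when no exception occurred


class FakeTag:
    """Minimal stand-in for a Beautiful Soup tag."""
    text = " 127.43 "


print(text_or_default(FakeTag()))  # tag present
print(text_or_default(None))       # tag missing
```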

Inside the try block we extract the data for each individual stock and store it in these variables:

  • company (name of the stock)
  • price (current price)
  • change (change in the stock value: positive for an increase, negative for a decrease)
  • volume (stock volume)

We collect these four values in a list for each stock and append it to an outer list that holds the data for all the stocks.

Python3




# class shared by the price and change elements
price_class = ('instrument-price_instrument-price__xfgbB flex '
               'items-end flex-wrap font-bold')

all = []  # one [company, price, change, volume] list per stock
for url in urls:
    page = requests.get(url, headers=headers)
    try:
        soup = BeautifulSoup(page.text, 'html.parser')
        company = soup.find('h1', {'class':
            'text-2xl font-semibold '
            'instrument-header_title__gCaMF mobile:mb-2'}).text
        # price and change live in the same div, in different span tags
        spans = soup.find('div', {'class': price_class}).find_all('span')
        price = spans[0].text
        change = spans[2].text
        volume = soup.find('div', {'class':
            'trading-hours_value__5_NnB'}).text
        x = [company, price, change, volume]
        all.append(x)

    except AttributeError:
        # soup.find() returned None: the class names no longer match the page
        print("Change the Element id")
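
The scraped values are strings. If you want to sort or compare them, they need converting to numbers first. The sketch below assumes the price looks like 1,234.56 and the change looks like (+0.95%); both formats and the helper names are assumptions, so adjust them to what the page actually returns.

```python
def to_float(text):
    """Convert a string such as '1,234.56' to a float."""
    return float(text.replace(',', '').strip())


def percent_to_float(text):
    """Convert a string such as '(+0.95%)' or '(-1.20%)' to a float."""
    cleaned = text.strip().strip('()').rstrip('%')
    return float(cleaned.replace(',', ''))


print(to_float('1,234.56'))
print(percent_to_float('(-1.20%)'))
```

A negative result from percent_to_float() means the stock price decreased.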


Step 5: Storing Data to Data Frame 

Let’s declare the column names and use pandas to create a DataFrame with the columns Company, Price, Change, and Volume.

Syntax:

column_names = [list of column names]
dataframe = pd.DataFrame(columns = column_names)

Next, we iterate through the list and add one row per company using loc. loc is a label-based selector, which means we pass the label of the row or column we want to select. The statement df.loc[index] = i assigns the data to the row labelled index (0 here). We then shift every index label up by one, so that label 0 is free for the next row. Finally, reset_index(drop=True) renumbers the rows of the DataFrame starting from 0.

Python3




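column_names = ["Company", "Price", "Change", "Volume"]
df = pd.DataFrame(columns=column_names)
for i in all:
    index = 0
    df.loc[index] = i        # write this stock's data to the row labelled 0
    df.index = df.index + 1  # shift all labels up so 0 is free again
df = df.reset_index(drop=True)  # renumber the rows 0, 1, 2, ...
df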
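
pandas can also build the same table in one step by passing the list of rows straight to the DataFrame constructor. This is an alternative to the loop above, not the article's method; the two sample rows are placeholder values.

```python
import pandas as pd

column_names = ["Company", "Price", "Change", "Volume"]

# Placeholder rows in the same [company, price, change, volume] shape.
rows = [
    ["Company A", "100.00", "(+1.00%)", "1,000"],
    ["Company B", "200.00", "(-0.50%)", "2,000"],
]

# Each inner list becomes one row; the index already starts at 0.
df = pd.DataFrame(rows, columns=column_names)
print(df)
```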



Step 6: Save it to Excel

To save the data as an Excel file we can use the DataFrame method to_excel. Writing .xlsx files requires an Excel engine such as openpyxl to be installed.

Python3




df.to_excel('stocks.xlsx')
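
By default to_excel also writes the row index as the first column. A hedged sketch of two common variations follows: dropping the index, and writing CSV instead. The sample row, the sheet name, and the helper name save_excel are placeholders; the CSV is written to an in-memory buffer here so the example has no side effects.

```python
import io
import pandas as pd

df = pd.DataFrame(
    [["Company A", "100.00", "(+1.00%)", "1,000"]],  # placeholder row
    columns=["Company", "Price", "Change", "Volume"],
)


def save_excel(frame, path="stocks.xlsx"):
    """Write the frame without the index column (needs an Excel engine such as openpyxl)."""
    frame.to_excel(path, index=False, sheet_name="Stocks")


# CSV needs no extra dependency; index=False omits the row numbers.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```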


Complete code

Here is the entire code:

Python3




import requests
from bs4 import BeautifulSoup
import pandas as pd
  
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/84.0.4147.105 Safari/537.36'}

urls = [
    # add the URLs from Step 2 here
    ]

# class shared by the price and change elements
price_class = ('instrument-price_instrument-price__xfgbB flex '
               'items-end flex-wrap font-bold')

all = []  # one [company, price, change, volume] list per stock
for url in urls:
    page = requests.get(url, headers=headers)
    try:
        soup = BeautifulSoup(page.text, 'html.parser')
        company = soup.find('h1', {'class':
            'text-2xl font-semibold '
            'instrument-header_title__gCaMF mobile:mb-2'}).text
        # price and change live in the same div, in different span tags
        spans = soup.find('div', {'class': price_class}).find_all('span')
        price = spans[0].text
        change = spans[2].text
        volume = soup.find('div', {'class':
            'trading-hours_value__5_NnB'}).text
        x = [company, price, change, volume]
        all.append(x)

    except AttributeError:
        # soup.find() returned None: the class names no longer match the page
        print("Change the Element id")

column_names = ["Company", "Price", "Change", "Volume"]
df = pd.DataFrame(columns=column_names)
for i in all:
    index = 0
    df.loc[index] = i        # write this stock's data to the row labelled 0
    df.index = df.index + 1  # shift all labels up so 0 is free again
df = df.reset_index(drop=True)
df.to_excel('stocks.xlsx')


Output:

[Image: the resulting stocks.xlsx with the Company, Price, Change, and Volume columns]


