Extract Data From JustDial using Selenium
Let us see how to extract data from Justdial using Selenium and Python. Justdial is a company that provides local search for different services in India by phone, website and mobile apps. In this article, we will extract the following data:
- Phone number
- Name
- Address
We can then save the data in a CSV file.
Approach:
- Import the following modules: webdriver from selenium, ChromeDriverManager from webdriver_manager.chrome, pandas, time and os.
- Use the driver.get() method and pass the link you want to get information from.
- Use driver.find_elements() with By.CLASS_NAME and pass 'store-details' (older Selenium releases spell this driver.find_elements_by_class_name()).
- Instantiate empty lists to store the values.
- Iterate over storeDetails and fetch the individual details that are required.
- Create a user-defined function strings_to_num() that converts each code taken from an element's class name into the phone number character it stands for.
- Display the details and save them as a CSV file according to the requirements.
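The decoding step in the list above can be tried in isolation, without a browser. The following is a minimal sketch, assuming each digit element has a class attribute of the form "mobilesv icon-yz"; the sample class strings and the names CODE_TO_CHAR and decode_phone are hypothetical and used only for illustration. Unlike strings_to_num() in the listing that follows, this sketch skips unknown codes instead of returning "nothing".

```python
# Mapping from the code in the class name to the character it stands for
# (the same table the article uses in strings_to_num).
CODE_TO_CHAR = {
    'dc': '+', 'fe': '(', 'hg': ')', 'ba': '-', 'acb': '0',
    'yz': '1', 'wx': '2', 'vu': '3', 'ts': '4', 'rq': '5',
    'po': '6', 'nm': '7', 'lk': '8', 'ji': '9',
}


def decode_phone(class_values):
    """Turn a list of class attribute strings into a phone number string."""
    chars = []
    for value in class_values:
        # take the segment that follows the first hyphen, e.g. 'yz'
        code = value.split("-")[1]
        # unknown codes are skipped (an assumption of this sketch)
        chars.append(CODE_TO_CHAR.get(code, ""))
    return "".join(chars)


# Hypothetical class values, one per digit element on the page
sample = ["mobilesv icon-dc", "mobilesv icon-ji", "mobilesv icon-yz"]
print(decode_phone(sample))  # +91
```

In the real script, the class strings come from get_attribute('class') on each 'mobilesv' element.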
Python3
# importing the modules
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import time
import os

# start Chrome (Selenium 4 style); webdriver_manager downloads the driver binary
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# driver.get() will navigate to the page given by the URL address
# placeholder: replace with the Justdial results page you want to scrape
url = "<Justdial search results URL>"
driver.get(url)


# the user-defined function: maps a class-name code to a phone number character
def strings_to_num(argument):
    switcher = {
        'dc': '+', 'fe': '(', 'hg': ')', 'ba': '-', 'acb': '0',
        'yz': '1', 'wx': '2', 'vu': '3', 'ts': '4', 'rq': '5',
        'po': '6', 'nm': '7', 'lk': '8', 'ji': '9'
    }
    return switcher.get(argument, "nothing")


# fetching all the store details
storeDetails = driver.find_elements(By.CLASS_NAME, 'store-details')

# instantiating empty lists
nameList = []
addressList = []
numbersList = []

# iterating the storeDetails
for store in storeDetails:
    # fetching the name, address and contact for each entry
    name = store.find_element(By.CLASS_NAME, 'lng_cont_name').text
    address = store.find_element(By.CLASS_NAME, 'cont_sw_addr').text
    contactList = store.find_elements(By.CLASS_NAME, 'mobilesv')

    # each 'mobilesv' element encodes one character in its class name
    myList = []
    for contact in contactList:
        myString = contact.get_attribute('class').split("-")[1]
        myList.append(strings_to_num(myString))

    nameList.append(name)
    addressList.append(address)
    numbersList.append("".join(myList))

# initialise data of lists
data = {'Company Name': nameList,
        'Address': addressList,
        'Phone': numbersList}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

# Save Data as .csv (appends rows; no header row is written)
df.to_csv('demo1.csv', mode='a', header=False)
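Output: the script prints the resulting DataFrame (company name, address and phone number for each listing) and appends the same rows to demo1.csv.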
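Because the script appends to demo1.csv with header=False, the file never receives a header row. One possible way around this is to write the header only when the file is new. The sketch below shows the idea with Python's standard csv module rather than pandas, so it runs without extra packages; the function name append_rows and the sample rows are hypothetical.

```python
import csv
import os
import tempfile


def append_rows(path, rows, fieldnames):
    """Append dict rows to a CSV file, writing the header only for a new file."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)


# Demo in a temporary directory with made-up rows
demo_path = os.path.join(tempfile.mkdtemp(), "demo1.csv")
fields = ["Company Name", "Address", "Phone"]
append_rows(demo_path, [{"Company Name": "Store A", "Address": "Street 1", "Phone": "+911"}], fields)
append_rows(demo_path, [{"Company Name": "Store B", "Address": "Street 2", "Phone": "+912"}], fields)

with open(demo_path, newline="", encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)  # header appears once, followed by the two rows
```

With pandas, a similar effect should be achievable by passing header=not os.path.exists('demo1.csv') to df.to_csv(), which would also give the os import in the script a purpose.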