Extract Data From JustDial using Selenium
  • Last Updated : 04 Jan, 2021

Let us see how to extract data from Justdial using Selenium and Python. Justdial is a company that provides local search for different services in India over the phone, through its website, and through mobile apps. In this article, we will extract the following data:

  • Phone number
  • Name
  • Address

We can then save the data in a CSV file.

Approach:

  1. Import the following modules: webdriver from selenium, ChromeDriverManager, pandas, time and os.
  2. Use the driver.get() method and pass the link you want to get information from.
  3. Use the driver.find_elements_by_class_name() method and pass 'store-details'.
  4. Instantiate empty lists to store the values.
  5. Iterate over storeDetails and fetch the individual details that are required.
  6. Create a user-defined function strings_to_num() to convert the codes extracted from the class names into the characters of the phone number.
  7. Display the details and save them as a CSV file according to the requirements.
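
Step 6 benefits from a short illustration. The code in this article reads each character of a phone number from an element's class name rather than from visible text, taking the part after the hyphen as a code. The following is a minimal sketch of that decoding that runs without a browser. The sample class strings (including the "mobilesv icon-" prefix) and the helper name decode_phone are assumptions made for illustration; the mapping is copied from the strings_to_num() function in this article.

```python
# Mapping from class-name codes to the characters they stand for,
# copied from the article's strings_to_num() function.
ICON_TO_CHAR = {
    'dc': '+', 'fe': '(', 'hg': ')', 'ba': '-', 'acb': '0',
    'yz': '1', 'wx': '2', 'vu': '3', 'ts': '4', 'rq': '5',
    'po': '6', 'nm': '7', 'lk': '8', 'ji': '9',
}


def decode_phone(class_attrs):
    """Turn a list of class attribute strings into a phone number string."""
    chars = []
    for attr in class_attrs:
        # Same split the scraper uses: keep the part after the hyphen.
        code = attr.split("-")[1]
        # Unknown codes are skipped in this sketch.
        chars.append(ICON_TO_CHAR.get(code, ""))
    return "".join(chars)


# Hypothetical class attributes; the "mobilesv icon-" prefix is an assumption.
sample = ["mobilesv icon-dc", "mobilesv icon-ji", "mobilesv icon-yz"]
print(decode_phone(sample))  # +91
```

Skipping unknown codes is a choice made for this sketch; the function in the article returns the string "nothing" for them instead.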

Python3




# importing the modules
import os
import time

import pandas as pd
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

# starting a Chrome session; ChromeDriverManager installs a matching driver
driver = webdriver.Chrome(ChromeDriverManager().install())

# driver.get(url) navigates to the page at the given URL;
# pass the Justdial listing page you want to scrape
# the user-defined function: maps a class-name code to the character it represents
def strings_to_num(argument):

    switcher = {
        'dc': '+',
        'fe': '(',
        'hg': ')',
        'ba': '-',
        'acb': '0',
        'yz': '1',
        'wx': '2',
        'vu': '3',
        'ts': '4',
        'rq': '5',
        'po': '6',
        'nm': '7',
        'lk': '8',
        'ji': '9'
    }

    return switcher.get(argument, "nothing")
  
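# fetching all the store details (Selenium 3 API; Selenium 4 replaces it
# with driver.find_elements(By.CLASS_NAME, 'store-details'))
storeDetails = driver.find_elements_by_class_name('store-details')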
  
# instantiating empty lists
nameList = []
addressList = []
numbersList = []
  
# iterating the storeDetails
for store in storeDetails:

    # fetching the name, address and contact for each entry
    name = store.find_element_by_class_name('lng_cont_name').text
    address = store.find_element_by_class_name('cont_sw_addr').text
    contactList = store.find_elements_by_class_name('mobilesv')

    # each 'mobilesv' element encodes one character of the phone number
    myList = []

    for contact in contactList:

        # the code is the part of the class attribute after the hyphen
        myString = contact.get_attribute('class').split("-")[1]

        myList.append(strings_to_num(myString))

    nameList.append(name)
    addressList.append(address)
    numbersList.append("".join(myList))
      
# initialise the data as a dict of lists
data = {'Company Name': nameList,
        'Address': addressList,
        'Phone': numbersList}
  
# Create DataFrame
df = pd.DataFrame(data)
print(df)
  
# append the data to demo1.csv (no header row is written)
df.to_csv('demo1.csv', mode='a', header=False)
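
Because the script appends with header = False, the resulting file never gets a header row. One possible variation, shown here only as a sketch, writes the header when the file does not exist yet and appends plain rows afterwards. The helper name append_to_csv, the sample row, and the temporary path are hypothetical, and index=False is an added choice that the original call does not make.

```python
import os
import tempfile

import pandas as pd


def append_to_csv(df, path):
    """Append df to path, writing the header row only if the file is new."""
    is_new_file = not os.path.exists(path)
    df.to_csv(path, mode='a', header=is_new_file, index=False)


# Hypothetical row standing in for scraped results.
batch = pd.DataFrame({'Company Name': ['Example Store'],
                      'Address': ['Example Street'],
                      'Phone': ['+91']})

# Write to a temporary directory so the sketch leaves no files behind in
# the working directory.
csv_path = os.path.join(tempfile.mkdtemp(), 'demo1.csv')
append_to_csv(batch, csv_path)  # creates the file and writes the header
append_to_csv(batch, csv_path)  # appends the rows only

with open(csv_path) as f:
    lines = f.read().splitlines()
print(lines[0])
```

With this approach the script can be run repeatedly against different pages while the CSV keeps a single header line at the top.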

