Scrape LinkedIn Profiles without login using Python

Last Updated : 31 Jan, 2024

In this article, we'll look at how to scrape publicly visible LinkedIn profile pages without logging in. Using Python's requests and BeautifulSoup libraries, we can fetch a public profile page and extract details such as the profile name, designation, followers count, and description.

Scrape LinkedIn Profiles without login using Python

Below is the step-by-step implementation for scraping LinkedIn profiles without login using Python.

Create a Virtual Environment

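First, create and activate a virtual environment using the commands below. The activation command shown is for Windows PowerShell; on macOS or Linux, run source env/bin/activate instead.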

python -m venv env 
.\env\Scripts\activate.ps1

Install Necessary Library

Next, install the required third-party libraries, requests and bs4. The re module used later is part of Python's standard library and does not need to be installed.

pip install requests
pip install bs4

Import the Library

Now import the modules used in the following steps: requests, re, and BeautifulSoup from bs4.

Python3




import requests
import re
from bs4 import BeautifulSoup
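
Before fetching a real page, it can help to see how the BeautifulSoup lookups used in the next step behave on a small document. The following is a minimal sketch that parses a made-up HTML string (an assumption for illustration, not real LinkedIn markup) and shows that find() returns None when an element is absent.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page; not real LinkedIn markup.
sample_html = """
<html>
  <head>
    <title>Example Org | LinkedIn</title>
    <meta property="og:description" content="Example Org | 1,234 followers on LinkedIn.">
  </head>
  <body>
    <h2>Education</h2>
  </body>
</html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

title_tag = soup.find("title")
meta_tag = soup.find("meta", {"property": "og:description"})
missing_tag = soup.find("p", class_="break-words")  # absent in the sample

print(title_tag.get_text(strip=True))  # Example Org | LinkedIn
print(meta_tag["content"])             # Example Org | 1,234 followers on LinkedIn.
print(missing_tag)                     # None
```

Because a missing element comes back as None, each lookup should be checked before calling get_text() on it.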


Implement the Logic

In the code below, the 'scrape_linkedin_profiles' function extracts information from a given LinkedIn profile URL. It uses the 'requests' library to fetch the page's HTML, sending a custom 'User-Agent' header so the request is made without a login session. It then uses 'BeautifulSoup' to parse the HTML and extract the profile name, designation, followers count, and description. Each value is read only if its HTML element is present; otherwise a default "not found" message is used instead. The results are printed, and if the request does not return status code 200, an error message with the status code is printed.

Python3




def scrape_linkedin_profiles(url):
    # Send a custom User-Agent so the request is made without a login session
    headers = {"User-Agent": "Guest"}

    # The timeout keeps the call from waiting indefinitely on a stalled connection
    response = requests.get(url, headers=headers, timeout=10)

    if response.status_code == 200:  # request succeeded
        soup = BeautifulSoup(response.content, "html.parser")

        # Locate the elements that hold the profile information
        title_tag = soup.find("title")
        designation_tag = soup.find("h2")
        followers_tag = soup.find("meta", {"property": "og:description"})
        description_tag = soup.find("p", class_="break-words")

        # Fall back to a default message when a tag is not found
        name = title_tag.get_text(strip=True).split("|")[0].strip() if title_tag else "Profile Name not found"
        designation = designation_tag.get_text(strip=True) if designation_tag else "Designation not found"

        # Use a regular expression to pull the followers count out of the og:description text
        followers_text = followers_tag.get("content", "") if followers_tag else ""
        followers_match = re.search(r"\b(\d[\d,.]*)\s+followers\b", followers_text)
        followers_count = followers_match.group(1) if followers_match else "Followers count not found"

        description = description_tag.get_text(strip=True) if description_tag else "Description not found"

        print(f"Profile Name: {name}")
        print(f"Designation: {designation}")
        print(f"Followers Count: {followers_count}")
        print(f"Description: {description}")
    else:
        print(f"Error: Unable to retrieve the LinkedIn profile. Status code: {response.status_code}")
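
The two string-handling steps in the function can be tried on their own, without a network request. The following is a minimal sketch using made-up sample strings; the helper names parse_name and parse_followers are hypothetical and are not part of the function above.

```python
import re

# Same pattern as in the scraper: a number (digits, commas, dots) before "followers"
FOLLOWERS_PATTERN = re.compile(r"\b(\d[\d,.]*)\s+followers\b")


def parse_name(title_text):
    """Return the part of a page title before the first '|'."""
    return title_text.split("|")[0].strip()


def parse_followers(description_text):
    """Return the followers count as a string, or None if it is absent."""
    match = FOLLOWERS_PATTERN.search(description_text)
    return match.group(1) if match else None


# Made-up sample strings, not real LinkedIn data
print(parse_name("Example Org | LinkedIn"))                          # Example Org
print(parse_followers("Example Org | 1,234 followers on LinkedIn."))  # 1,234
print(parse_followers("No count in this text"))                      # None
```

The count is kept as a string because the pattern also accepts commas and dots, so it should be normalized before being converted to a number.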


Create Pipeline For Usage

Now we pass the target LinkedIn profile URL (here, the GeeksforGeeks LinkedIn profile) to the scraper function.

Python3




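# Pipeline
# Assumed URL of the GeeksforGeeks LinkedIn page; replace it with your target profile
profile_url = "https://www.linkedin.com/company/geeksforgeeks/"
scrape_linkedin_profiles(profile_url)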
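
If you want to process several profiles, one option is a loop that pauses between requests. The following is a rough sketch under stated assumptions: run_pipeline is a hypothetical helper, the example.com URLs are placeholders, and the scraper is passed in as a parameter so the loop can be tried with a stand-in that makes no network calls.

```python
import time


def run_pipeline(urls, scrape, delay_seconds=2.0):
    """Call `scrape` on each URL in order, pausing between consecutive calls."""
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # wait before every request after the first
        scrape(url)


# Stand-in scraper that only records the URLs it receives (no network access).
seen = []
run_pipeline(
    ["https://example.com/a", "https://example.com/b"],
    scrape=seen.append,
    delay_seconds=0,
)
print(seen)
```

In practice you would pass scrape_linkedin_profiles as the scrape argument and choose a delay that suits your use case.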


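Output

The script prints four lines: Profile Name, Designation, Followers Count, and Description. Any value that could not be found is replaced by its "not found" message.
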