Create GitHub API to fetch user profile image and number of repositories using Python and Flask

  • Last Updated : 29 Nov, 2021

GitHub is a platform where developers host and manage Git repositories and contribute to the open-source community. It is one of the most widely used developer tools, and developers often share their GitHub profiles to showcase their work or invite contributions to their projects. Web scraping with Python is one way to collect data from such profile pages.

In this article, we will create an API that fetches a user’s profile image and number of repositories. The article follows these steps:

  • Setting up the App Directory
  • Web Scrape data from GitHub.
    • Beautiful Soup in Python would be used.
  • Create an API.
    • Flask would be used.

Setting up the App Directory

Step 1: Create a folder (e.g., GitHubGFG).

Step 2: Set up the virtual environment. Here we create an environment named .env:

python -m venv .env

Step 3: Activate the environment. The command below is for Windows; on macOS or Linux, run source .env/bin/activate instead.

.env\Scripts\activate
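
To confirm that the interpreter in use really belongs to the virtual environment, a quick check along the following lines may help. This is a minimal sketch; the helper name in_virtualenv is made up for this illustration. It relies on the fact that inside a venv, sys.prefix points at the environment while sys.base_prefix points at the base Python installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix is the environment directory and
    # sys.base_prefix is the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```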

Scraping the Data

Step 1: Beautiful Soup is a Python library for pulling data out of HTML files. To install it, run:

pip install beautifulsoup4

Step 2: Install the Requests module, which lets you send HTTP/1.1 requests with very little code.

pip install requests

Create a Python file (e.g., github.py).

Step 3: The following steps scrape data from the web page. First, get the HTML text of the page:

github_html = requests.get(f'https://github.com/{username}').text

Here {username} is replaced by the GitHub username of the required user. The BeautifulSoup object represents the parsed document as a whole:

soup = BeautifulSoup(github_html, "html.parser")

Example:

Python3




from bs4 import BeautifulSoup
import requests
 
username = "kothawleprem"
 
github_html = requests.get(f'https://github.com/{username}').text
soup = BeautifulSoup(github_html, "html.parser")
print(soup)

This prints the entire HTML of the profile page.

Next, find the img tags with the avatar class in the HTML document, as one of them holds the URL of the profile image.

find_all(): The find_all() method looks through a tag’s descendants and retrieves all descendants that match the filters. Here our filter is an img tag with the class as avatar.
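
To see the filter in isolation, here is a small sketch that runs find_all() on a hand-written HTML fragment. The fragment, its URLs, and the names sample_html and sample_soup are invented for this illustration and are not GitHub's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, not GitHub's real markup.
sample_html = """
<div>
  <img class="avatar" src="/images/first.png">
  <img class="logo" src="/images/logo.png">
  <img class="avatar avatar-user" src="/images/second.png">
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# class_ matches any tag whose class attribute contains "avatar",
# so the tag with two classes is included as well.
avatars = sample_soup.find_all("img", class_="avatar")

for tag in avatars:
    print(tag.get("src"))
```

The img tag with the logo class is skipped, while both tags carrying the avatar class are returned in document order.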

Python3




avatar_block = soup.find_all('img',class_='avatar')
print(avatar_block)

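This prints a list of every img tag on the page that carries the avatar class.

The image URL is stored in the src attribute of a tag; use .get() to read it. The profile picture is taken from index 4 here, which reflects GitHub's page layout at the time of writing and may change: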

Python3




img_url = avatar_block[4].get('src')
print(img_url)

This prints the URL of the profile image.

Next, find the first element with the Counter class in the HTML document, as it holds the number of repositories.

find(): The find() method looks through a tag’s descendants and retrieves a single descendant that matches the filters. Here our filter is a span tag with the class as Counter.

repos = soup.find('span',class_="Counter").text
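
The behaviour of find() can be sketched on a hand-written fragment as well. The HTML below and the names sample_html and sample_soup are assumptions made for this illustration. The sketch also shows that find() returns None when nothing matches, which is why calling .text on the result directly can fail.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, not GitHub's real markup.
sample_html = """
<nav>
  <span class="Counter"> 33 </span>
  <span class="Counter">7</span>
</nav>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# find() returns only the first matching tag.
first_counter = sample_soup.find("span", class_="Counter")
print(first_counter.text.strip())

# When nothing matches, find() returns None, so calling .text
# on the result directly would raise AttributeError.
missing = sample_soup.find("span", class_="DoesNotExist")
print(missing is None)
```

Stripping the text removes any whitespace that surrounds the number in the markup.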

The entire code would be as follows:

Python3




from bs4 import BeautifulSoup
import requests
 
username = "kothawleprem"
 
github_html = requests.get(f'https://github.com/{username}').text
soup = BeautifulSoup(github_html, "html.parser")
avatar_block = soup.find_all('img',class_='avatar')
img_url = avatar_block[4].get('src')
repos = soup.find('span',class_="Counter").text
 
print(img_url)
print(repos)

Output:

https://avatars.githubusercontent.com/u/59017652?v=4
33
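
One possible way to organise the scraping steps is to separate downloading from parsing, so the parsing part can be tried on saved or hand-written HTML. The sketch below is not part of the original code: the function names fetch_profile_html and parse_profile, the avatar_index parameter, and the 10-second timeout are assumptions. The selectors and the index 4 are taken from the article and depend on GitHub's current markup.

```python
import requests
from bs4 import BeautifulSoup


def fetch_profile_html(username: str) -> str:
    """Download the profile page; raises requests.HTTPError on 4xx/5xx."""
    response = requests.get(f"https://github.com/{username}", timeout=10)
    response.raise_for_status()
    return response.text


def parse_profile(html: str, avatar_index: int = 4) -> dict:
    """Extract the avatar URL and repository count from profile HTML."""
    soup = BeautifulSoup(html, "html.parser")
    avatars = soup.find_all("img", class_="avatar")
    counter = soup.find("span", class_="Counter")
    # Fail with a clear error instead of IndexError/AttributeError.
    if len(avatars) <= avatar_index or counter is None:
        raise ValueError("expected tags not found in the page")
    return {
        "imgUrl": avatars[avatar_index].get("src"),
        "numRepos": counter.text.strip(),
    }
```

With network access, parse_profile(fetch_profile_html("kothawleprem")) would combine the two steps.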

Creating the API

We will use Flask which is a micro web framework written in Python.

pip install Flask

Following is the starter code for our Flask application.

Python3




# We import the Flask Class, an instance of
# this class will be our WSGI application.
from flask import Flask
 
# We create an instance of this class. The first
# argument is the name of the application’s module
# or package.
# __name__ is a convenient shortcut for
# this that is appropriate for most cases. This is
# needed so that Flask knows where to look for resources
# such as templates and static files.
app = Flask(__name__)
 
# We use the route() decorator to tell Flask what URL
# should trigger our function.
@app.route('/')
def github():
    return "Welcome to GitHubGFG!"
 
# main driver function
if __name__ == "__main__":
   
    # run() method of Flask class runs the
    # application on the local development server.
    app.run(debug=True)

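Run the file and open http://127.0.0.1:5000/ (the default address of the Flask development server) in your browser; the page shows the welcome message.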
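
The route can also be checked without a browser. The following sketch uses Flask's built-in test client, which sends requests to the application in-process; it repeats the starter route so that it runs on its own, and the variable names client and response are chosen for this illustration.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def github():
    return "Welcome to GitHubGFG!"


# The test client sends requests to the app in-process,
# so no server has to be running.
client = app.test_client()
response = client.get("/")

print(response.status_code)
print(response.get_data(as_text=True))
```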

Getting the GitHub username from the URL:

Python3




from flask import Flask
 
app = Flask(__name__)
 
@app.route('/<username>')
def github(username):
    return f"Username: {username}"
 
if __name__ == "__main__":
    app.run(debug=True)

Visiting http://127.0.0.1:5000/kothawleprem, for example, displays Username: kothawleprem.
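
As a rough illustration of how the <username> part of the rule is matched, the sketch below sends two requests through Flask's test client. The second path is an invented example with an extra segment; because the default string converter does not match a slash, it is not routed to the view.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/<username>")
def github(username):
    return f"Username: {username}"


client = app.test_client()

# A single path segment is captured as the username.
print(client.get("/kothawleprem").get_data(as_text=True))

# The default string converter does not match "/",
# so a path with two segments does not reach the view.
print(client.get("/kothawleprem/extra").status_code)
```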

We now add our web scraping code to the route and return the result as JSON (JavaScript Object Notation). Flask provides jsonify, a function that serializes data to JSON; since Flask 1.1, a dictionary returned from a view function is also converted to a JSON response automatically, which is what the following code relies on:

Python3




import requests
from bs4 import BeautifulSoup
from flask import Flask
 
app = Flask(__name__)
 
@app.route('/<username>')
def github(username):
    github_html = requests.get(f'https://github.com/{username}').text
    soup = BeautifulSoup(github_html, "html.parser")
    avatar_block = soup.find_all('img',class_='avatar')
    img_url = avatar_block[4].get('src')
    repos = soup.find('span',class_="Counter").text
     
    # Creating a dictionary for our data
    result = {
        'imgUrl' : img_url,
        'numRepos' : repos,
    }
    return result
 
if __name__ == "__main__":
    app.run(debug=True)

The endpoint now responds with a JSON object containing the imgUrl and numRepos keys.
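
To compare an explicit jsonify call with returning a plain dictionary, the following sketch defines two hypothetical routes, /explicit and /implicit, that return placeholder data (the example.com URL is not a real avatar). It assumes Flask 1.1 or later, where a returned dictionary is converted to JSON automatically.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder data standing in for scraped values.
SAMPLE = {"imgUrl": "https://example.com/a.png", "numRepos": "33"}


@app.route("/explicit")
def explicit():
    # jsonify builds a response with a JSON body and
    # the application/json content type.
    return jsonify(SAMPLE)


@app.route("/implicit")
def implicit():
    # Returning a plain dict; Flask 1.1+ converts it to JSON itself.
    return SAMPLE


client = app.test_client()
for path in ("/explicit", "/implicit"):
    response = client.get(path)
    print(path, response.mimetype, response.get_json())
```

Both routes should produce the same JSON body, so either style can be used in the view.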

The scraping code fails if the username is incorrect, or for other reasons such as a network error, so we wrap it in a try and except block to handle exceptions. The final code would be as follows:

Python3




import requests
from bs4 import BeautifulSoup
from flask import Flask
 
app = Flask(__name__)
 
@app.route('/<username>')
def github(username):
    try:
        github_html = requests.get(f'https://github.com/{username}').text
        soup = BeautifulSoup(github_html, "html.parser")
        avatar_block = soup.find_all('img',class_='avatar')
        img_url = avatar_block[4].get('src')
        repos = soup.find('span',class_="Counter").text
        # Creating a dictionary for our data
        result = {
            'imgUrl' : img_url,
            'numRepos' : repos,
        }
    # Catch scraping and network failures, but not SystemExit or KeyboardInterrupt.
    except Exception:
        result = {
            "message": "Invalid Username!"
        }, 400
    return result
 
if __name__ == "__main__":
    app.run(debug=True)
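
The error path can be exercised without contacting GitHub. The sketch below is one possible way to do so and is not part of the original code: it restates the view in compact form, catches a narrower set of exceptions, and replaces requests.get with a stand-in through unittest.mock.patch. The FakeResponse class and the 10-second timeout are assumptions made for this illustration.

```python
from unittest.mock import patch

import requests
from bs4 import BeautifulSoup
from flask import Flask

app = Flask(__name__)


@app.route("/<username>")
def github(username):
    try:
        html = requests.get(f"https://github.com/{username}", timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        img_url = soup.find_all("img", class_="avatar")[4].get("src")
        repos = soup.find("span", class_="Counter").text.strip()
        return {"imgUrl": img_url, "numRepos": repos}
    except (IndexError, AttributeError, requests.RequestException):
        # Missing tags or a failed request both end up here.
        return {"message": "Invalid Username!"}, 400


class FakeResponse:
    """Hypothetical stand-in for requests.Response with only .text."""

    def __init__(self, text):
        self.text = text


# Replace requests.get so the check needs no network access.
with patch("requests.get", return_value=FakeResponse("<html></html>")):
    response = app.test_client().get("/no-such-user")

print(response.status_code, response.get_json())
```

Because the stand-in page contains no avatar tags, the index lookup raises IndexError and the view returns the error message with status 400.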

