Create GitHub API to fetch user profile image and number of repositories using Python and Flask
Last Updated :
29 Nov, 2021
GitHub is where developers manage Git repositories, collaborate on software, and contribute to the open-source community. It is one of the most widely used developer tools, and a GitHub profile is often shared to showcase work or to invite contributions to a project. Web scraping with Python is one way to collect data from such pages.
In this article, we will create an API that returns a user's profile image and number of repositories. The article follows this flow:
- Set up the app directory.
- Scrape data from GitHub using Beautiful Soup.
- Create an API.
Setting up the App Directory
Step 1: Create a folder (e.g. GitHubGFG).
Step 2: Set up the virtual environment. Here we create an environment named .env:
python -m venv .env
Step 3: Activate the environment. On Windows:
.env\Scripts\activate
On macOS or Linux:
source .env/bin/activate
Scraping the Data
Step 1: Beautiful Soup is a Python library for pulling data out of HTML files. To install it, run:
pip install beautifulsoup4
Step 2: Install the Requests module. Requests lets you send HTTP/1.1 requests easily.
pip install requests
Create a Python file (e.g. github.py).
Step 3: Scrape the data from the web page. First, get the HTML text of the page:
github_html = requests.get(f'https://github.com/{username}').text
Here {username} is replaced with the GitHub username of the required user. To represent the parsed document as a whole, we use a BeautifulSoup object:
soup = BeautifulSoup(github_html, "html.parser")
Example:
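Python3
from bs4 import BeautifulSoup
import requests

username = "kothawleprem"
# Download the HTML of the user's profile page
github_html = requests.get(f'https://github.com/{username}').text
# Parse the HTML into a BeautifulSoup object
soup = BeautifulSoup(github_html, "html.parser")
print(soup)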
Output: the complete HTML of the user's profile page.
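The fetch step can be made a little more defensive. The following is a minimal sketch, not part of the original code: the helper names profile_url and fetch_profile_html and the 10-second timeout are assumptions chosen for illustration. It percent-encodes the username, sets a timeout, and raises an error for HTTP failures instead of parsing an error page.

```python
from urllib.parse import quote

import requests


def profile_url(username: str) -> str:
    """Build the profile URL, percent-encoding unusual characters."""
    return f"https://github.com/{quote(username, safe='')}"


def fetch_profile_html(username: str, timeout: float = 10.0) -> str:
    """Download the profile page and return its HTML text."""
    response = requests.get(profile_url(username), timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses, such as a 404
    response.raise_for_status()
    return response.text


# Only the URL is built here; no network request is made at this point.
print(profile_url("kothawleprem"))
```

Calling fetch_profile_html("kothawleprem") would then return the same HTML text as the requests.get(...).text line above, provided the request succeeds.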
Now find the avatar class in the HTML document as it has the required URL for the profile image.
find_all(): The find_all() method looks through a tag’s descendants and retrieves all descendants that match the filters. Here our filter is an img tag with the class as avatar.
Python3
avatar_block = soup.find_all('img', class_='avatar')
print(avatar_block)
The output of avatar_block is a list of all the matching img tags.
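To see how find_all() filters by tag name and class without fetching a live page, the sketch below parses a small hand-written snippet. The markup and the example.com URLs are hypothetical and much simpler than GitHub's real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; GitHub's real page is far larger.
sample_html = """
<div>
  <img class="avatar" src="https://example.com/a.png">
  <img class="logo" src="https://example.com/logo.png">
  <img class="avatar avatar-user" src="https://example.com/b.png">
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
avatars = soup.find_all("img", class_="avatar")

# class_ matches any tag whose list of classes contains "avatar"
sources = [tag.get("src") for tag in avatars]
print(sources)
```

The img tag with class logo is skipped, while the tag carrying two classes still matches because one of them is avatar.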
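The image URL is in the src attribute; use .get() to read it. The code takes the fifth match in the list (index 4), a position that depends on the layout of the profile page: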
Python3
img_url = avatar_block[4].get('src')
print(img_url)
Following is the output of img_url:
https://avatars.githubusercontent.com/u/59017652?v=4
Find the first Counter class in the HTML document as it has the required data for the number of repositories.
find(): The find() method looks through a tag’s descendants and retrieves a single descendant that matches the filters. Here our filter is a span tag with the class as Counter.
repos = soup.find('span',class_="Counter").text
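The difference between find() and find_all() is easiest to see on a small input. This sketch uses hypothetical markup with two Counter spans. It also shows that find() returns None when nothing matches, in which case accessing .text would raise an AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
sample_html = """
<nav>
  <span class="Counter">33</span>
  <span class="Counter">5</span>
</nav>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find() stops at the first matching tag
first_counter = soup.find("span", class_="Counter")
print(first_counter.text)

# find() returns None when no tag matches the filter
missing = soup.find("span", class_="NoSuchClass")
print(missing)
```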
The entire code would be as follows:
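Python3
from bs4 import BeautifulSoup
import requests

username = "kothawleprem"
# Download and parse the profile page
github_html = requests.get(f'https://github.com/{username}').text
soup = BeautifulSoup(github_html, "html.parser")
# Profile image URL: src attribute of the fifth avatar image
avatar_block = soup.find_all('img', class_='avatar')
img_url = avatar_block[4].get('src')
# Number of repositories: text of the first Counter span
repos = soup.find('span', class_="Counter").text
print(img_url)
print(repos)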
Output:
https://avatars.githubusercontent.com/u/59017652?v=4
33
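Because the hard-coded index 4 and the first Counter span both depend on the page layout, it can help to keep the parsing in one function that fails with a clear error when the expected tags are missing. The following is a sketch under that assumption. The name parse_profile, its avatar_index parameter, and the sample markup are hypothetical and not part of the original code.

```python
from bs4 import BeautifulSoup


def parse_profile(html: str, avatar_index: int = 4) -> dict:
    """Extract the avatar URL and repository count from profile HTML.

    avatar_index defaults to the hard-coded index used in the article.
    """
    soup = BeautifulSoup(html, "html.parser")
    avatars = soup.find_all("img", class_="avatar")
    counter = soup.find("span", class_="Counter")
    if len(avatars) <= avatar_index or counter is None:
        raise ValueError("expected profile elements were not found")
    return {
        "imgUrl": avatars[avatar_index].get("src"),
        "numRepos": counter.text.strip(),
    }


# Hypothetical markup: a single avatar, so index 0 is used here.
sample_html = (
    '<img class="avatar" src="https://example.com/me.png">'
    '<span class="Counter"> 33 </span>'
)
print(parse_profile(sample_html, avatar_index=0))
```

Separating parsing from fetching also means the function can be exercised on saved or hand-written HTML, without a network connection.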
Creating the API
We will use Flask which is a micro web framework written in Python.
pip install Flask
Following is the starter code for our Flask application.
Python3
from flask import Flask

app = Flask(__name__)


@app.route('/')
def github():
    return "Welcome to GitHubGFG!"


if __name__ == "__main__":
    app.run(debug=True)
Run the file and open http://127.0.0.1:5000/ in your browser to see the welcome message.
Getting the GitHub username from the URL:
Python3
from flask import Flask

app = Flask(__name__)


@app.route('/<username>')
def github(username):
    # <username> in the route is passed to the function as an argument
    return f"Username: {username}"


if __name__ == "__main__":
    app.run(debug=True)
Output: opening http://127.0.0.1:5000/kothawleprem shows "Username: kothawleprem".
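The route can also be checked without a browser. This sketch repeats the route above and calls it through Flask's built-in test client, which issues requests without starting a server. The client, response and body variables are illustrative additions.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/<username>')
def github(username):
    return f"Username: {username}"


# The test client sends requests to the app without running a server.
client = app.test_client()
response = client.get('/kothawleprem')
body = response.get_data(as_text=True)
print(response.status_code, body)
```

Since this app defines only the /<username> route, a request to / has no matching rule and returns a 404.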
We will now add our web scraping code to the route and return the result as JSON. Flask provides jsonify, a function that serializes data to JavaScript Object Notation (JSON) format. Since Flask 1.1, a dictionary returned from a view function is serialized the same way automatically, which the following code relies on:
Python3
import requests
from bs4 import BeautifulSoup
from flask import Flask

app = Flask(__name__)


@app.route('/<username>')
def github(username):
    # Download and parse the profile page
    github_html = requests.get(f'https://github.com/{username}').text
    soup = BeautifulSoup(github_html, "html.parser")
    avatar_block = soup.find_all('img', class_='avatar')
    img_url = avatar_block[4].get('src')
    repos = soup.find('span', class_="Counter").text
    result = {
        'imgUrl': img_url,
        'numRepos': repos,
    }
    # Flask converts the returned dictionary to a JSON response
    return result


if __name__ == "__main__":
    app.run(debug=True)
Output: a JSON object with the keys imgUrl and numRepos.
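For comparison, the same kind of response can be produced with an explicit jsonify call. This is a sketch only: the /demo route and its placeholder values are hypothetical and stand in for the scraped data.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route('/demo')
def demo():
    # Placeholder values stand in for scraped data.
    return jsonify(imgUrl="https://example.com/me.png", numRepos="33")


client = app.test_client()
response = client.get('/demo')
print(response.mimetype)
print(response.get_json())
```

Either style sends a response with the application/json content type, so the choice is mostly about how explicit the code should be.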
If the username is incorrect, or the page cannot be fetched or parsed for some other reason, the code raises an exception. Wrapping it in a try/except block lets the API return an error message with status 400 instead. The final code would be as follows:
Python3
import requests
from bs4 import BeautifulSoup
from flask import Flask

app = Flask(__name__)


@app.route('/<username>')
def github(username):
    try:
        github_html = requests.get(f'https://github.com/{username}').text
        soup = BeautifulSoup(github_html, "html.parser")
        avatar_block = soup.find_all('img', class_='avatar')
        img_url = avatar_block[4].get('src')
        repos = soup.find('span', class_="Counter").text
        result = {
            'imgUrl': img_url,
            'numRepos': repos,
        }
    except (requests.RequestException, IndexError, AttributeError):
        # The request failed, or the expected tags were not on the page
        result = {
            "message": "Invalid Username!"
        }, 400
    return result


if __name__ == "__main__":
    app.run(debug=True)
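The error path is hard to exercise against the live site, so one option is to pass the fetching step in as a function and test the route with canned HTML. The sketch below assumes that design: create_app, fetch_html, and FAKE_PAGES are hypothetical names, and the fake page simply contains five avatar tags and one Counter span so that index 4 resolves.

```python
from bs4 import BeautifulSoup
from flask import Flask


def create_app(fetch_html):
    """Build the app around fetch_html, a callable: username -> HTML text."""
    app = Flask(__name__)

    @app.route('/<username>')
    def github(username):
        try:
            soup = BeautifulSoup(fetch_html(username), "html.parser")
            img_url = soup.find_all('img', class_='avatar')[4].get('src')
            repos = soup.find('span', class_="Counter").text.strip()
        except (LookupError, AttributeError):
            # Unknown user (KeyError from the fake fetcher below) or
            # expected tags absent from the page (IndexError, AttributeError).
            return {"message": "Invalid Username!"}, 400
        return {'imgUrl': img_url, 'numRepos': repos}

    return app


# Hypothetical stand-in for the network call.
FAKE_PAGES = {
    "demo": '<img class="avatar" src="x.png">' * 4
            + '<img class="avatar" src="https://example.com/me.png">'
            + '<span class="Counter">33</span>',
}

client = create_app(lambda name: FAKE_PAGES[name]).test_client()
ok = client.get('/demo')
bad = client.get('/nobody')
print(ok.status_code, ok.get_json())
print(bad.status_code, bad.get_json())
```

In a real deployment, fetch_html would be a function that downloads the page with requests, and its network errors would need to be added to the except clause.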