Fetch top 10 starred repositories of user on GitHub | Python

Prerequisites: Basic understanding of Python, urllib (urllib2 in Python 2) and BeautifulSoup

We often write Python scripts to make our tasks easier. The script below fetches the top 10 starred repositories of any user on GitHub.
You only need a GitHub username (for example: msdeep14) to run the script.
Script Explanation: 

  1. First build the URL of the user's repositories page, for example: username = “msdeep14”, then url = “”
  2. Scrape that page and fetch the star count, repository name and repository URL of each repository using BeautifulSoup.
  3. One page lists 30 repositories, so if the user has more than 30 repositories, you need a loop to visit all the pages.
  4. You can scrape the page with urllib (urllib2 in Python 2) or with requests and BeautifulSoup. The code below uses both.
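
The pagination loop in steps 3 and 4 can be sketched offline as follows. This is only a minimal illustration, not the article's script: it swaps in the standard-library `html.parser` for BeautifulSoup and lxml, the `fetch` callable and the `FAKE_PAGES` dictionary are hypothetical stand-ins for real HTTP requests, and the markup is heavily simplified compared with what GitHub actually serves.

```python
from html.parser import HTMLParser


class RepoPageParser(HTMLParser):
    """Collect stargazer links and the next-page link from one HTML page."""

    def __init__(self):
        super().__init__()
        self.stars = {}        # repository name -> star count
        self.next_url = None   # href of the "next_page" link, if any
        self._current = None   # repository whose star count is being read

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        classes = (attrs.get("class") or "").split()
        if href.endswith("/stargazers"):
            # href is assumed to look like /<user>/<repo>/stargazers
            self._current = href.split("/")[2]
        elif "next_page" in classes:
            self.next_url = href

    def handle_data(self, data):
        text = data.strip()
        if self._current and text.isdigit():
            self.stars[self._current] = int(text)

    def handle_endtag(self, tag):
        if tag == "a":
            self._current = None


def collect_stars(first_url, fetch):
    """Follow next-page links from first_url, merging star counts per page."""
    stars = {}
    url = first_url
    while url is not None:
        parser = RepoPageParser()
        parser.feed(fetch(url))
        stars.update(parser.stars)
        url = parser.next_url   # None on the last page, which ends the loop
    return stars


# Two made-up pages standing in for real responses (hypothetical data).
FAKE_PAGES = {
    "/demo?page=1": '<a href="/demo/alpha/stargazers"> 13 </a>'
                    '<a class="next_page" href="/demo?page=2">Next</a>',
    "/demo?page=2": '<a href="/demo/beta/stargazers"> 8 </a>',
}

result = collect_stars("/demo?page=1", FAKE_PAGES.__getitem__)
print(result)
```

Passing `fetch` in as a parameter keeps the loop independent of the HTTP library, so the same function could be driven by `urllib.request` or `requests` in a real run.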



# Python3 script to fetch top 10 starred
# repositories of a user on github
import urllib.request, urllib.parse, urllib.error
import http.cookiejar
import requests
from lxml import html
from lxml import etree
from bs4 import BeautifulSoup
import re
import operator
top_limit = 9
def openWebsite():
    # enter Github username
    # of user
    username = str(input("enter GitHub username: "))
    # Dictionary to store key as repository
    # name and value as no. of stars
    repo_dict = {}
    # This is first page url where user
    # repositories are located
    url = ""+username+"?tab=repositories"
    # loop for all the pages
    while True:
        # You can read the docs of urllib and
        # BeautifulSoup to see how an HTML page
        # can be scraped to extract data
        # open the website and get
        # the html of webpage into doc
        cj = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
        resp = opener.open(url)
        doc = html.fromstring(resp.read())
        # extract all the repository names
        repo_name = doc.xpath('//li[@class="col-12 d-block width-full py-4 border-bottom public source"]/div[@class="d-inline-block mb-1"]/h3/a/text()')
        # list to store repository names
        repo_list = []
        # get the repository name
        for name in repo_name:
            name = ' '.join(''.join(name).split())
            repo_dict[name] = 0
        # print repo_list
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        # To find the path used for the no. of
        # stargazers, right click on the star
        # symbol on the GitHub page and then
        # select inspect element
        div = soup.find_all('li', {'class': 'col-12 d-block width-full py-4 border-bottom public source'})
        for d in div:
            temp = d.find_all('div',{'class':'f6 text-gray mt-2'})
            for t in temp:
                # Get the no. of stars of
                # particular repository
                x = t.find_all('a', attrs={'href': re.compile(r"^/[a-zA-Z0-9\-_.]+/[a-zA-Z0-9.\-_]+/stargazers")})
                # Get the url of the repository
                # and populate the values of dictionary
                # with no. of stars
                if len(x) != 0:
                    name = x[0].get('href')
                    name = name[len(username)+2:-11]
                    repo_dict[name] = int(x[0].text)
        # Check if next page exists
        # for more repositories
        div = soup.find('a',{'class':'next_page'})
        # print div
        if div is not None:
            url = div.get('href')
            url = ""+url
        else:
            # if there is no next repository
            # page, then exit loop
            break
    # Get the sorted list of all
    # repos and print top 10
    i = 0
    sorted_repo = sorted(iter(repo_dict.items()), key = operator.itemgetter(1))
    # Print the sorted repos in
    # reverse order
    for val in reversed(sorted_repo):
        repo_url = "" + username + "/" + val[0]
        print("\nrepo name : ",val[0], "\nrepo url  : ",repo_url, "\nstars     : ",val[1])
        i = i + 1
        if i > top_limit:
            break
# Driver program
if __name__ == "__main__":
    openWebsite()

Output:

enter GitHub username: msdeep14

repo name :  DeepDataBase 
repo url  : 
stars     :  13

repo name :  MiniDataBase 
repo url  : 
stars     :  8

repo name :  hackerranksolutions 
repo url  : 
stars     :  6

repo name :  stayUpdated 
repo url  : 
stars     :  6

repo name :  IRCTC 
repo url  : 
stars     :  4

repo name :  play_2048 
repo url  : 
stars     :  3

repo name :  Tripcount 
repo url  : 
stars     :  3

repo name :  SnapLook 
repo url  : 
stars     :  2

repo name :  fbFun 
repo url  : 
stars     :  2

repo name :  ByteCode 
repo url  : 
stars     :  2
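
The final step, picking the most-starred repositories out of the name-to-stars dictionary, can also be written with `heapq.nlargest` instead of sorting the whole dictionary and reversing it. This is a minimal sketch of an alternative, not the script's own method: `top_starred` is a hypothetical helper name, and the sample dictionary only reuses a few values from the output above.

```python
import heapq


def top_starred(repo_dict, limit=10):
    """Return up to `limit` (name, stars) pairs, most-starred first."""
    return heapq.nlargest(limit, repo_dict.items(), key=lambda item: item[1])


# A few name/star pairs taken from the sample output above.
sample = {"DeepDataBase": 13, "MiniDataBase": 8, "IRCTC": 4, "fbFun": 2}

top = top_starred(sample, limit=3)
for name, stars in top:
    print(f"{name}: {stars}")
```

Because the helper takes the limit as an argument, it avoids the manual counter and `top_limit` comparison used in the printing loop.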

Complete repository link : trackGitHubStars

Last Updated : 17 Jun, 2021