Extracting Code From GeeksForGeeks Article

  • Last Updated : 29 Dec, 2020

Prerequisite:

Modules Needed

  • requests: Requests lets you send HTTP/1.1 requests with very little code. It is not part of the Python standard library. To install it, run the following command in the terminal:
pip install requests
  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library either. To install it, run the following command in the terminal:
pip install bs4

Approach:

  • Import modules
  • Get the article name as input
  • Initiate a get request to the URL
  • Scrape the code blocks and the name of the language each one is written in using bs4
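
The scraping step can be tried out without any network access. The following is a minimal sketch, assuming a small hand-written HTML string (SAMPLE_HTML, hypothetical) in place of a real article page; only the class names tabtitle and code-container are taken from the script in this article, and the helper name extract_snippets is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a fetched article page; only the class names
# 'tabtitle' and 'code-container' come from the script in this article.
SAMPLE_HTML = """
<div>
  <h2 class="tabtitle">Python3</h2>
  <div class="code-container">print("hello")</div>
  <h2 class="tabtitle">C++</h2>
  <div class="code-container">int main() { return 0; }</div>
</div>
"""


def extract_snippets(html):
    """Return (language, code) pairs found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    titles = soup.find_all("h2", class_="tabtitle")
    blocks = soup.find_all("div", class_="code-container")
    # zip stops at the shorter list, so a missing title or block
    # cannot cause an IndexError.
    return [
        (title.get_text(strip=True), block.get_text().strip())
        for title, block in zip(titles, blocks)
    ]


snippets = extract_snippets(SAMPLE_HTML)
for language, code in snippets:
    print(f"{language}: {code}")
```

Pairing titles and blocks by position is itself an assumption: it only holds if every code block on the page has exactly one language tab title in the same order.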

This approach can be extended in several ways. For example, you can save each code snippet to a separate file with the appropriate extension, or scrape the complete article and extract other information such as the author's details.
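
Saving each snippet to its own file could look roughly like the sketch below. It uses only the standard library. The EXTENSIONS mapping, the save_snippets helper, and the snippet_N file naming are all hypothetical choices made for illustration, not part of the original script.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from language tab title to file extension.
EXTENSIONS = {"Python3": ".py", "C++": ".cpp", "Java": ".java", "C": ".c"}


def save_snippets(snippets, out_dir, stem="snippet"):
    """Write each (language, code) pair to its own file and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for number, (language, code) in enumerate(snippets, start=1):
        # Languages missing from the mapping fall back to a plain .txt file.
        extension = EXTENSIONS.get(language, ".txt")
        target = out_path / f"{stem}_{number}{extension}"
        target.write_text(code, encoding="utf-8")
        written.append(target)
    return written


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    paths = save_snippets(
        [("Python3", "print('hi')"), ("Go", "package main")], tmp
    )
    names = [p.name for p in paths]
    print(names)
```

Falling back to .txt keeps the function usable for languages that are not in the mapping, at the cost of a less descriptive file name.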

Below is the implementation.

Python3




import requests
from bs4 import BeautifulSoup
  
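# Input: the GeeksforGeeks article slug (the last part of the article URL)
article = 'extract-authors-information-from-geeksforgeeks-article-using-python'

# 1-based index of the code block to print; set to None to print all blocks
index_Code = 3
  
# URL of the article (assumed here to be the site root followed by the slug)
url = 'https://www.geeksforgeeks.org/' + article + '/'
  
  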
# Making a GET request
# to fetch article from
# geeksforgeeks servers
def getdata(url):
    r = requests.get(url)
    return r.text
  
  
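def codescrapper(soup, index=None):
    # Language tab titles and the code blocks they label
    codes_languages = soup.find_all('h2', class_='tabtitle')
    codes = soup.find_all("div", class_='code-container')
    # Use the smaller count so indexing into codes can never overrun
    count_codes = min(len(codes_languages), len(codes))
    print(url)
      
    # Print only the requested block when the 1-based index is valid
    if index and 1 <= index <= count_codes:
        print(codes[index-1].get_text())
          
    # Otherwise print every code block that was found
    else:
        for x in range(count_codes):
            print(codes[x].get_text())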
  
  
if __name__ == '__main__':
    
    complete_article_html = getdata(url)
    soup = BeautifulSoup(complete_article_html, 'html.parser')
    codescrapper(soup, index_Code)
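
The getdata function in the script returns whatever the server sends back, including error pages, and waits indefinitely if the server does not respond. A slightly more defensive variant is sketched below. The 10-second default timeout and the optional session parameter are assumptions added for illustration; requests.get, the timeout argument, and raise_for_status are standard parts of the requests library.

```python
import requests


def getdata(url, session=None, timeout=10):
    """Fetch a page and return its HTML, raising on HTTP errors."""
    # A caller may pass a requests.Session (or a test double); otherwise
    # the module-level requests.get is used.
    client = session if session is not None else requests
    response = client.get(url, timeout=timeout)
    # Turn 4xx/5xx responses into requests.HTTPError instead of
    # silently parsing an error page.
    response.raise_for_status()
    return response.text
```

With this variant, a mistyped article name raises an exception at the request step instead of producing an empty result later in the parsing step.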
