Python script to monitor website changes
Last Updated :
03 Apr, 2024
In this article, we are going to discuss how to create a python script to monitor website changes. You can code a program to monitor a website and it will notify you if there are any changes. This program has many useful scenarios for example if your school website has updated something you will come to know about it.
Approach:
We will follow the following steps to write this program:
- Read the URL you want to monitor.
- Hash the entire website.
- Wait for a specified amount of seconds.
- If there are any changes as compared to the previous hash notify me else wait and again and then again take the hash.
Libraries required:
Libraries we will be using are:
- time: To wait for a specified amount of time.
- hashlib: To hash the content of the entire website.
- urllib: To perform the get request and load the content of the website.
Implementation:
Python3
import time
import hashlib
from urllib.request import urlopen, Request
url = Request( 'www.geeksforgeeks.org' ,
headers = { 'User-Agent' : 'Mozilla/5.0' })
response = urlopen(url).read()
currentHash = hashlib.sha224(response).hexdigest()
print ( "running" )
time.sleep( 10 )
while True :
try :
response = urlopen(url).read()
currentHash = hashlib.sha224(response).hexdigest()
time.sleep( 30 )
response = urlopen(url).read()
newHash = hashlib.sha224(response).hexdigest()
if newHash = = currentHash:
continue
else :
print ( "something changed" )
response = urlopen(url).read()
currentHash = hashlib.sha224(response).hexdigest()
time.sleep( 30 )
continue
except Exception as e:
print ( "error" )
|
Output:
output
Note: time.sleep() takes seconds as a parameter. You can make changes to notifications instead of printing the status on the terminal you can write a program to get an email.
Code Explanation:
- The code starts by importing the libraries.
- Then it sets up a URL to monitor and performs a GET request on that website.
- The response is then stored in a variable called response.
- Next, the hash of the response is created with the help of hashlib and stored in currentHash.
- Next, time is set to sleep for 10 seconds before continuing while looping through an infinite loop which will continue until something changes or there’s an exception.
- If anything changes, it will be printed out as well as another GET request performed on the website again after 30 seconds has passed without any change happening to either hashes (the first one or second one).
- The code is a Python script that monitors an URL for changes and notifies the user if there is one.
- The code first imports libraries needed to perform the desired task.
- It then sets up the URL to monitor, which will be www.geeksforgeeks.org Next, it performs a GET request on the website and stores it in a variable called response.
- The code then creates an initial hash using sha224(response).hexdigest().
- Next, it sleeps for 10 seconds before iterating through the loop again with urlopen(url).read() being performed every 30 seconds.
- If something changed in the hashes of currentHash and newHash, then print(“something changed”) is done to notify that something has changed.
Like Article
Suggest improvement
Share your thoughts in the comments
Please Login to comment...