
Shell script to traverse all internal URLs and report any errors in the “traverse.errors” file

Last Updated : 28 Feb, 2022

If you run a web server or are responsible for a website, whether simple or complex, you probably find yourself performing certain tasks frequently, most notably identifying broken internal and external site links. Shell scripts can automate many of these tasks, as well as other common client/server functions such as managing access information for a password-protected website. The shell script below uses the lynx text browser to traverse all internal URLs on a given website and reports any errors in the “traverse.errors” file.

Usage: traverse.sh <URL>

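#!/bin/sh
# traverse.sh - traverse all internal URLs on a website, reporting
# any errors in the "traverse.errors" file.

lynx="lynx"    # or a full path such as /usr/local/bin/lynx

# Remove lynx's traversal data files when the script exits.
trap 'rm -f traverse.dat traverse2.dat' 0

if [ -z "$1" ] ; then
  echo "Usage: traverse.sh URL" >&2
  exit 1
fi

# Host part of the URL (third "/"-separated field); used to name the output files.
baseurl="$(echo "$1" | cut -d/ -f3)"

# -traversal follows every link; -realm keeps the crawl within the starting realm.
"$lynx" -traversal -accept_all_cookies -realm "$1" > /dev/null

if [ -s "traverse.errors" ] ; then
  # traverse.errors exists and is non-empty: report and keep a copy.
  printf '%s errors encountered. ' "$(( $(wc -l < traverse.errors) ))"
  echo "Checked $(grep '^http' traverse.dat | wc -l) pages at ${1}:"
  sed "s|$1||g" < traverse.errors    # strip the base URL from each line
  mv traverse.errors "${baseurl}.errors"
  echo "A copy of this output has been saved in ${baseurl}.errors"
else
  printf 'No errors encountered. '
  echo "Checked $(grep '^http' traverse.dat | wc -l) pages at ${1}"
fi

# Keep the list of rejected (out-of-realm) links, if lynx produced one.
if [ -s "reject.dat" ] ; then
  mv reject.dat "${baseurl}.rejects"
fi
exit 0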
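
The baseurl line in the script names the saved report after the site's host. As a minimal sketch of that one step, the helper below (host_of_url is a hypothetical name, not part of the original script) assumes the URL has the form scheme://host/path, and example.com is only a placeholder site.

```shell
#!/bin/sh
# host_of_url: hypothetical helper, not part of the original script.
# Prints the host part of a URL of the form scheme://host/path.
host_of_url() {
  # Splitting on "/" gives: field 1 = "http:", field 2 = "" (between the
  # two slashes), field 3 = the host.
  printf '%s\n' "$1" | cut -d/ -f3
}

# example.com is a placeholder used only for illustration.
baseurl="$(host_of_url "http://www.example.com/docs/index.html")"
echo "Errors would be saved in ${baseurl}.errors"
```

Because the host is always the third field regardless of the scheme, the same cut call works for both http and https URLs.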

Scenario 1: No Errors 

Fig 1.2 – No Errors

Scenario 2: Some Errors

Fig 1.3 – 5 errors encountered
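
Which of the two scenarios is reported depends on whether traverse.errors exists and is non-empty, which the script checks with the -s test. The sketch below isolates that decision so it can be tried without running lynx. The function name summarize_errors and the demo file contents are assumptions made for illustration, not real lynx output.

```shell
#!/bin/sh
# summarize_errors: hypothetical helper, not part of the original script.
# Prints a one-line summary for the error file given as $1.
summarize_errors() {
  if [ -s "$1" ] ; then
    # wc -l may pad its output with spaces on some systems; the
    # arithmetic expansion normalizes it to a plain number.
    count=$(( $(wc -l < "$1") ))
    echo "$count errors encountered."
  else
    # A missing or empty file both count as "no errors".
    echo "No errors encountered."
  fi
}

# Demo with a throwaway file; the two lines are placeholder content.
demo="$(mktemp)"
printf 'placeholder line 1\nplaceholder line 2\n' > "$demo"
summarize_errors "$demo"
: > "$demo"               # truncate the file to zero length
summarize_errors "$demo"
rm -f "$demo"
```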

