People all over the world turn to search engines with questions both trivial and vitally important, and find answers almost immediately. Searching is one of the most common actions on the internet, and search engines are among its most frequently visited sites. A search engine is a web-based program designed to search and organize content from the huge collection of resources we call the World Wide Web. Normally, to access a piece of information, the user would need to know the exact location of the website that holds it, i.e. its URL.
This job of finding the URL is done by the search engine. When a request is sent to the search engine, it returns links to the websites the user wanted to visit, or to pages relevant to the query. For any given query there may be millions of matching sites, and the results returned depend on the search terms the user enters and on the algorithms each engine uses to pick the best matches.
How It All Started
Internet technology brought a quantum leap in the transfer of information between distant places in no time at all. The internet revolutionized the way people live; today we can't imagine a day without it. When internet technology, which began as a defense network project, was first made available to the public, there were only about 2,800 websites. Early search tools were basic: Archie, released in 1990, could point the user to a specific location, but the user still had to know the domain name of the website to visit. The browser would send that name to a DNS server, which resolved it to the IP address used to connect to the website's server.
But by the end of 1995 there were about a hundred thousand websites, and it was practically impossible to know the domain name of every one. We tend to forget that the internet didn't magically bring with it the ability to find anything in this giant computer network, until some clever people decided that information on the internet would be much more useful if it were readily discoverable. This is when the concept of the search engine was born.
After 1995, the world of search engines changed rapidly. Websites began tailoring their pages, adding content designed to rank well on specific engines. Search engines, in turn, began examining websites with bot programs called web crawlers or spiders: programs that visit a site, read its content and images, follow the pages it links to, and index those pages along with their links. The result was more an indexed dictionary than a true search engine that could deliver the best results. Some companies, like Yahoo, categorized sites manually and lost popularity when they couldn't keep pace with automated engines. A single misspelling could produce vastly different results, or even land the user on spam sites.
How Do Search Engines Work?
Modern search engines work very differently from their predecessors. They use advanced search algorithms to deliver the best results to their users. When a user submits a query, the engine does not actually crawl the whole World Wide Web in real time. The reason is simple: there are currently over a billion websites on the internet, with roughly 380 more added every minute. If the engine went around looking at every single site to find the one the user wanted, it would take forever.
Hence, to make searches faster, search engines constantly scan websites in advance and store information that might help with a user's search later. This works because the internet is a web of pages connected by hyperlinks. Crawler programs (spiders) jump from one site to the next along those links: each time a crawler encounters a hyperlink, it follows it, records that page's information too, and continues until it has visited every page it can reach from its starting point. Everything a crawler records is added to a special database called the search index, which contains all the information about websites used to produce search results. So by the time the user searches for anything, the answer is already sitting in that index.
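The crawling process described above can be sketched as a breadth-first traversal of a link graph. The following toy example (with made-up page names, and an in-memory dictionary standing in for real websites) shows how following every hyperlink eventually visits each reachable page exactly once:

```python
from collections import deque

# A toy "web": each page maps to the pages it links to.
# The page names are illustrative, not real sites.
toy_web = {
    "home":  ["about", "blog"],
    "about": ["home"],
    "blog":  ["home", "post1", "post2"],
    "post1": ["blog"],
    "post2": ["blog", "about"],
}

def crawl(start, web):
    """Breadth-first crawl: follow every hyperlink from `start`
    and record each page the first time it is seen."""
    visited, queue, order = set(), deque([start]), []
    while queue:
        page = queue.popleft()
        if page in visited:
            continue            # already indexed, skip
        visited.add(page)
        order.append(page)
        queue.extend(web.get(page, []))  # follow outgoing links
    return order

print(crawl("home", toy_web))
# → ['home', 'about', 'blog', 'post1', 'post2']
```

A real crawler would fetch pages over HTTP and parse links out of HTML, but the visit-and-follow logic is the same.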
Modern search engines have become so advanced that they don't just match the words typed into the search bar; they understand more than strings. To interpret the user's words better, they use machine learning, a branch of artificial intelligence. This lets their algorithms go beyond individual keywords to the underlying meaning behind them, achieving the goal of delivering the right information with just a few keystrokes.
Addressing a Problem: Even this method has one major flaw. When a search is performed, the engine looks up each query word in the search index and immediately gets a list of all pages on the internet relevant to those words. But that list can contain millions of pages, so the engine faces a big problem: it must determine the best-matching results for the user. In other words, the search engine needs to rank its results, and this need led to the development of page ranking algorithms.
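The index lookup described here is usually built as an inverted index: a map from each word to the set of pages containing it, so a query is answered by intersecting a few sets rather than scanning pages. A minimal sketch, with hypothetical documents standing in for crawled pages:

```python
# Hypothetical pages standing in for crawled content.
pages = {
    "page1": "search engines crawl the web",
    "page2": "the web is a graph of links",
    "page3": "crawlers follow links across the web",
}

def build_index(pages):
    """Inverted index: map each word to the set of pages containing it."""
    index = {}
    for name, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index

def search(index, query):
    """Return pages containing every query word (AND semantics)."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*sets) if sets else set()

index = build_index(pages)
print(sorted(search(index, "web links")))
# → ['page2', 'page3']
```

Note that this returns *all* matching pages with no notion of which is best; that is exactly the ranking problem the next section addresses.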
What is Page Ranking?
Page ranking is a ranking algorithm that rates the importance of a website based on what the algorithm estimates the user wants most. PageRank was named after Larry Page, one of the founders of Google, and it is the core of Google's search engine. Not only Google but many other search engines, such as Microsoft's Bing, have developed their own ranking algorithms. Google pioneered the approach of choosing the most relevant results for a search by taking into account how many other websites link to a given page and how important those linking pages are themselves. The basic idea is that if lots of websites link back to one single site, that site is probably the one the user is looking for and the most likely match for the query. The algorithm also weighs factors such as relevance, authenticity, credibility, and spam signals before determining importance.
The idea began with Larry Page at university, as a way of determining how important a research paper is: if many other papers cite a paper, that paper has higher importance and is likely the one a researcher is looking for. PageRank works in a similar fashion, counting the number and quality of links pointing to a page to roughly estimate the website's importance, on the assumption that more important websites are likely to receive more links from external websites. In layman's terms, PageRank is the vote of all other websites on how important a site is: a website that links to an external site is voting for it, and a website receiving links is being voted for by others.
In technical terms, PageRank (PR) is a link-analysis algorithm that assigns a numerical weight to each element of a set of hyperlinked documents on the web, with the sole purpose of measuring relative importance. This numerical weight is referred to as the page's PR, and it represents the likelihood that a user randomly clicking on links arrives at that page. The website with the highest PageRank will be the first result shown for a relevant search.
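The "random clicking" interpretation can be computed by iteration: each page repeatedly shares its current rank equally among the pages it links to, with a damping factor modeling the chance that the random surfer jumps to an arbitrary page instead of following a link. A minimal sketch on a made-up four-page graph (the page names and the 0.85 damping value are illustrative; 0.85 is the figure commonly cited for PageRank):

```python
# Link graph: page -> pages it links to (illustrative names).
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def pagerank(links, damping=0.85, iters=50):
    """Iteratively redistribute rank along outgoing links until it settles."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start uniform
    for _ in range(iters):
        # Base share from the "random jump" part of the model.
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = rank[p] / len(outs)      # split rank over out-links
                for q in outs:
                    new[q] += damping * share
        rank = new
    return rank

ranks = pagerank(links)
# "C" is linked to by the most pages, so it ends up with the highest rank.
```

Page "D" links out but receives no links, so it keeps only the baseline random-jump rank, matching the voting intuition above.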
Example: Suppose a user searches for the book “A Brief History of Time” by Stephen Hawking, and consider websites such as “amazon.com”, “astroboy.com”, “booky.com”, and “originalreviewer.com”.
Assume the book the user wants to buy is reviewed by many websites that routinely review books and point their readers to the best place to buy them at the best price, such as “amazon.com”. Articles and blogs by famous reviewers recommend “amazon.com” by dropping a link to it on their websites, and it is also linked indirectly by sites that link to those review sites. Since so many websites link to “amazon.com”, its weight, or PageRank, increases, making it more important and relevant to the user. Hence, when the user searches for the book, the first result that comes up is most likely to be “amazon.com”.
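In its simplest "voting" form, this example boils down to counting in-links. The link graph below is entirely hypothetical (invented for illustration; these sites do not actually link this way), but it shows why the most-linked-to site surfaces first:

```python
from collections import Counter

# Hypothetical link graph for the example: reviewer sites link to amazon.com.
links = {
    "originalreviewer.com": ["amazon.com"],
    "booky.com":            ["amazon.com", "astroboy.com"],
    "astroboy.com":         ["originalreviewer.com", "amazon.com"],
    "amazon.com":           [],
}

# Each outgoing link counts as one "vote" for its target.
votes = Counter(target for outs in links.values() for target in outs)
best = votes.most_common(1)[0][0]
print(best, votes[best])
# → amazon.com 3
```

Real PageRank also weighs *who* is voting, so a link from a highly ranked reviewer counts for more than a link from an obscure page, but the raw vote count already predicts the winner here.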
Search engines have become an integral part of modern society; people use them constantly to get answers to their queries. On average, engines like Google process over 40 thousand requests per second, which equates to about 3.5 billion searches every day. We have come remarkably far, from searching a directory for keywords to getting the address of a nearby restaurant without even specifying a location, since the engine already collects personal data to deliver results faster and more accurately.
Search engines offer their users vast and impressive amounts of information with a speed and convenience few people could have imagined just a few years ago. Their algorithms are updated regularly to improve both the speed of delivery and the accuracy of results. Search engines stand out as the most widely used websites in the world, and the companies that own them make billions each year. Yet while these engines serve people a great deal, most users know little or nothing about how they operate or how complex they are. Indeed, they don't need to.