Difference between Web Content, Web Structure, and Web Usage Mining

Web mining is the application of data mining techniques to discover patterns in web data. It helps improve web search engines by identifying relevant web pages and classifying web documents.

Types of Web Mining :

1. Web Content Mining –
Web Content Mining extracts useful data, information, and knowledge from the content of web pages. It scans and mines the text, images, and groups of web pages that match the content of a query, for example to produce the result list shown by a search engine.

There are two approaches that are used for Web Content Mining :

  • (i) Agent-based approach :
    This approach uses intelligent systems. It usually relies on autonomous agents that can identify relevant websites.

  • (ii) Data-based approach :
    The data-based approach organizes the semi-structured data found on the web into structured data.
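
As a rough illustration of the data-based approach, the sketch below turns a semi-structured HTML table into structured records using only Python's standard `html.parser` module. The class name `TableExtractor`, the helper `html_table_to_records`, and the sample markup are assumptions made up for this example; real pages are usually messier and need more careful handling.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being parsed, or None
        self._cell = None   # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text that appears inside a cell is kept.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_records(html):
    """Use the first row as field names and the remaining rows as records."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page fragment, invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>title</th><th>price</th></tr>
  <tr><td>Data Mining Basics</td><td>30</td></tr>
  <tr><td>Web Mining Intro</td><td>25</td></tr>
</table>
"""

records = html_table_to_records(SAMPLE_HTML)
print(records)
```

Each record is a plain dictionary keyed by the header cells, so the result can be loaded into a relational table or queried like any other structured data.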

2. Web Structure Mining –
Web Structure Mining discovers the link structure formed by hyperlinks. Its purpose is to produce a structural summary of websites and of similar web pages. It is concerned with the structure of hyperlinks within the web, and it can be applied at the document level or at the hyperlink level. Link structure is an important input to the overall web mining process.
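
One simple way to picture a structural summary is to build a link graph from a set of pages and count incoming and outgoing links per page. The sketch below does this with the standard library only. The page names, the markup, and the helpers `build_link_graph` and `degree_summary` are hypothetical, and in-degree and out-degree are just one possible summary, chosen here for brevity.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Records the href of every <a> tag found in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def build_link_graph(pages):
    """Map each page name to the set of other known pages it links to."""
    graph = {}
    for name, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        # Links to pages outside the collection and self-links are ignored.
        graph[name] = {h for h in collector.links if h in pages and h != name}
    return graph


def degree_summary(graph):
    """Return {page: (in_degree, out_degree)} for every page in the graph."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    return {page: (in_degree[page], len(targets)) for page, targets in graph.items()}


# Hypothetical three-page site, invented for illustration.
PAGES = {
    "index.html": '<a href="about.html">About</a> <a href="blog.html">Blog</a>',
    "about.html": '<a href="index.html">Home</a>',
    "blog.html": '<a href="index.html">Home</a> <a href="about.html">About</a> '
                 '<a href="https://example.com/">External</a>',
}

link_graph = build_link_graph(PAGES)
summary = degree_summary(link_graph)
for page, (incoming, outgoing) in sorted(summary.items()):
    print(page, "in:", incoming, "out:", outgoing)
```

Pages with many incoming links are often treated as more central; more elaborate link-analysis methods start from the same kind of graph.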

3. Web Usage Mining –
Web Usage Mining mines web log records, that is, the access information of web pages, to discover how users access those pages. Many research projects and tools analyze these access patterns for different purposes. Four mining techniques are mainly applied: Association Rule Mining, Sequential Pattern Mining, Clustering, and Classification.
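
To make the idea of mining log records concrete, here is a minimal sketch of association rule mining over page visits. It assumes the log lines follow the Common Log Format, treats each client host as one visitor (a simplification, since real tools usually split visits into sessions), and only considers rules between pairs of pages. The sample log lines, the function names, and the `min_support` threshold are all invented for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Assumed Common Log Format: host ident user [time] "METHOD path PROTO" status size
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*" (\d{3}) \S+$')


def pages_per_visitor(log_lines):
    """Group successfully served paths (status 200) by client host."""
    visits = defaultdict(set)
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match and match.group(3) == "200":
            visits[match.group(1)].add(match.group(2))
    return visits


def pair_rules(visits, min_support=0.5):
    """Return sorted (antecedent, consequent, support, confidence) tuples."""
    total = len(visits)
    page_counts = Counter()
    pair_counts = Counter()
    for pages in visits.values():
        page_counts.update(pages)
        pair_counts.update(combinations(sorted(pages), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / total  # share of visitors who saw both pages
        if support >= min_support:
            # confidence(a -> b) = visitors with both / visitors with a
            rules.append((a, b, support, count / page_counts[a]))
            rules.append((b, a, support, count / page_counts[b]))
    return sorted(rules)


# Hypothetical log excerpt, invented for illustration.
SAMPLE_LOG = [
    '10.0.0.1 - - [06/Jul/2020:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [06/Jul/2020:10:00:05 +0000] "GET /products HTTP/1.1" 200 1024',
    '10.0.0.2 - - [06/Jul/2020:10:01:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [06/Jul/2020:10:01:07 +0000] "GET /products HTTP/1.1" 200 1024',
    '10.0.0.2 - - [06/Jul/2020:10:01:30 +0000] "GET /cart HTTP/1.1" 200 256',
    '10.0.0.3 - - [06/Jul/2020:10:02:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.3 - - [06/Jul/2020:10:02:04 +0000] "GET /missing HTTP/1.1" 404 128',
]

rules = pair_rules(pages_per_visitor(SAMPLE_LOG))
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

With this sample, only the pair `/home` and `/products` clears the support threshold, and the rule `/products -> /home` has higher confidence than the reverse because every visitor who saw `/products` also saw `/home`.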



Difference Between Web Content, Web Structure, and Web Usage Mining :

| Criterion | Web Content (IR View) | Web Content (DB View) | Web Structure | Web Usage |
|---|---|---|---|---|
| View of data | Unstructured; Structured | Semi-structured; Website as DB | Link structure | Interactivity |
| Main data | Text documents; Hypertext documents | Hypertext documents | Link structure | Server logs; Browser logs |
| Method | Machine learning; Statistical (including NLP) | Proprietary algorithm; Association rules | Proprietary algorithm | Machine learning; Statistical; Association rules |
| Representation | Bag of words, n-gram terms; Phrases, concepts or ontology; Relational | Edge-labeled graph; Relational | Graph | Relational table; Graph |
| Application categories | Categorization; Clustering; Finding extraction rules; Finding patterns in text | Finding frequent substructures; Website schema discovery | Categorization; Clustering | Site construction; Adaptation and management |
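
The bag of words and n-gram representations listed for content mining can be sketched in a few lines. The example below uses a deliberately simple tokenizer (lowercase runs of letters and digits), which is an assumption for illustration; real systems typically add stop-word removal, stemming, or language-specific tokenization. The sample sentence and function names are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokenize(text))


def ngrams(text, n=2):
    """Count each run of n consecutive tokens."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical page text, invented for illustration.
SAMPLE_TEXT = "Web mining finds patterns in web data. Web mining helps search engines."

bag = bag_of_words(SAMPLE_TEXT)
bigrams = ngrams(SAMPLE_TEXT, 2)
print(bag.most_common(3))
print(bigrams.most_common(3))
```

The bag of words discards word order entirely, while the bigram counts keep a little local context, which is why phrases such as "web mining" show up as a single unit there.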


Last Updated : 06 Jul, 2020