Difference between Web Content, Web Structure, and Web Usage Mining

Web mining is the application of data mining techniques to discover information patterns in web data. It helps improve web search engines by identifying web pages and classifying web documents.
Types of Web Mining :
1. Web Content Mining – Web content mining extracts useful data, information, and knowledge from the content of web pages. It scans and mines the text, images, and groups of web pages that match the content of an input query, for example to produce the result list shown by a search engine. There are two approaches to web content mining :
  • (i) Agent-based approach : This approach relies on intelligent systems, usually autonomous agents that can identify relevant websites.
  • (ii) Data-based approach : This approach organizes the semi-structured data available on the internet into structured data.
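As a rough illustration of the data-based approach, the sketch below turns a semi-structured HTML page into a structured record using Python's standard `html.parser` module. The class name `PageRecordParser`, the helper `page_to_record`, the record fields, and the sample page are all assumptions made for this example, not part of any particular web mining tool.

```python
from html.parser import HTMLParser


class PageRecordParser(HTMLParser):
    """Collects the title and the outgoing links of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept in this sketch
        if self._in_title:
            self.title += data


def page_to_record(html):
    """Return a structured record (a dict) for one page (hypothetical helper)."""
    parser = PageRecordParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "links": parser.links}


# Made-up sample page used only for illustration
sample_html = """
<html><head><title>Laptop deals</title></head>
<body><a href="/laptops/a1">Model A1</a> <a href="/laptops/b2">Model B2</a></body></html>
"""
record = page_to_record(sample_html)
print(record)
```

Records of this shape could then be loaded into a table and queried like any other structured data.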
2. Web Structure Mining – Web structure mining discovers the link structure formed by hyperlinks, focusing on how pages within the web point to one another. Its purpose is to produce a structural summary of websites and of similar web pages. It is applied at the document level and at the hyperlink level. Web structure mining plays an important role in the overall mining process.
3. Web Usage Mining – Web usage mining works on weblog records (access information of web pages) to discover the patterns in which users access web pages. Many research projects and tools analyze these patterns for different purposes. Four mining techniques are mainly applied to web mining: Association Rule Mining, Sequential Pattern Mining, Clustering, and Classification.
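To make the usage mining idea concrete, here is a minimal sketch that reads access log lines and measures the support and confidence of one association rule between pages. It assumes log lines in a Common Log Format style and treats each client IP address as one visitor; both are simplifications, and the names `LOG_PATTERN`, `pages_per_visitor`, `rule_stats`, and the sample log are hypothetical.

```python
import re
from collections import defaultdict

# Assumed log layout: ip ident user [timestamp] "GET path protocol" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" \d{3} \S+$'
)

# Made-up sample log used only for illustration
sample_log = [
    '10.0.0.1 - - [06/Jul/2020:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [06/Jul/2020:10:00:05 +0000] "GET /products HTTP/1.1" 200 900',
    '10.0.0.1 - - [06/Jul/2020:10:00:09 +0000] "GET /cart HTTP/1.1" 200 300',
    '10.0.0.2 - - [06/Jul/2020:10:01:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [06/Jul/2020:10:01:04 +0000] "GET /products HTTP/1.1" 200 900',
    '10.0.0.3 - - [06/Jul/2020:10:02:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.4 - - [06/Jul/2020:10:03:00 +0000] "GET /products HTTP/1.1" 200 900',
    '10.0.0.4 - - [06/Jul/2020:10:03:07 +0000] "GET /cart HTTP/1.1" 200 300',
]


def pages_per_visitor(lines):
    """Group the requested paths by client IP (assumption: one IP = one visitor)."""
    visits = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            ip, path = match.groups()
            visits[ip].add(path)
    return visits


def rule_stats(visits, antecedent, consequent):
    """Support and confidence of the rule: antecedent page -> consequent page."""
    total = len(visits)
    with_a = sum(1 for pages in visits.values() if antecedent in pages)
    with_both = sum(
        1 for pages in visits.values()
        if antecedent in pages and consequent in pages
    )
    support = with_both / total if total else 0.0
    confidence = with_both / with_a if with_a else 0.0
    return support, confidence


visits = pages_per_visitor(sample_log)
support, confidence = rule_stats(visits, "/products", "/cart")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy log, two of the four visitors viewed both pages, and two of the three visitors who viewed `/products` also viewed `/cart`. A real analysis would need proper session identification rather than grouping by IP address alone.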
Difference Between Web Content, Web Structure, and Web Usage Mining :
View of data
  • Web Content: unstructured, structured, semi-structured, website as DB
  • Web Structure: link structure
  • Web Usage: interactivity
Main data
  • Web Content: text documents, hypertext documents
  • Web Structure: link structure
  • Web Usage: server logs, browser logs
Method
  • Web Content: machine learning, statistical (including NLP), proprietary algorithms, association rules
  • Web Structure: proprietary algorithms
  • Web Usage: machine learning, statistical, association rules
Representation
  • Web Content: bag of words, n-gram terms, phrases, concepts or ontology, relational, edge-labeled graph
  • Web Structure: graph
  • Web Usage: relational table, graph
Application Categories
  • Web Content: categorization, clustering, finding extraction rules, finding patterns in text, finding frequent substructures, website schema discovery
  • Web Structure: categorization, clustering
  • Web Usage: site construction, adaptation and management
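
Since web structure mining works on the link structure between pages, a simple way to picture it is as a directed graph. The sketch below stores a tiny, made-up site as an adjacency dictionary and counts incoming and outgoing links per page as a very basic structural summary. The `links` data and the `degree_summary` helper are assumptions for illustration; real structure mining systems use far richer algorithms.

```python
from collections import Counter

# Hypothetical site: each page maps to the pages it links to
links = {
    "/home": ["/products", "/about"],
    "/products": ["/home", "/cart"],
    "/about": ["/home"],
    "/cart": [],
}


def degree_summary(link_graph):
    """Count incoming and outgoing hyperlinks for every page in the graph."""
    in_degree = Counter({page: 0 for page in link_graph})
    for source, targets in link_graph.items():
        for target in targets:
            in_degree[target] += 1
    return {
        page: {"out": len(link_graph.get(page, [])), "in": in_degree[page]}
        for page in in_degree
    }


summary = degree_summary(links)
for page, degrees in summary.items():
    print(page, degrees)
```

Pages with many incoming links, such as `/home` here, are natural candidates for hubs or entry points when summarizing a site.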

Last Updated : 06 Jul, 2020