Different Types of Data in Data Mining
In general terms, “mining” is the process of extraction. In computer science, data mining is also known as knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. Beyond structured data such as relational tables, there are semi-structured and unstructured kinds of data, including spatial, multimedia, text, and web data, each of which requires its own mining methodology.
- Mining Multimedia Data: Multimedia data objects include image, video, and audio data, as well as website hyperlinks and linkages. Multimedia data mining tries to find interesting patterns in multimedia databases. It involves processing the digital data and performing tasks such as image processing, image classification, video and audio data mining, and pattern recognition. Multimedia data mining is an increasingly active research area because data from social media platforms such as Twitter and Facebook can be analyzed this way to derive interesting trends and patterns.
- Mining Web Data: Web mining is used to discover important patterns and knowledge from the Web. Web content mining analyzes data from multiple websites, including the web pages themselves and the multimedia data, such as images, embedded in them. Web mining is done to understand the content of web pages, the unique users of a website, unique hypertext links, web page relevance and ranking, web page content summaries, the time users spend on a particular website, and user search patterns. Web mining also supports the evaluation and improvement of search engines and their ranking algorithms, which helps improve search efficiency for users.
- Mining Text Data: Text mining is an interdisciplinary field that draws on data mining, machine learning, natural language processing (NLP), and statistics. Much of the information in our daily life is stored as text, such as news articles, technical papers, books, email messages, and blogs. Text mining helps us retrieve high-quality information from text through tasks such as sentiment analysis, document summarization, text categorization, and text clustering. We apply machine learning models and NLP techniques to derive useful information from the text, uncovering hidden patterns and trends by means such as statistical pattern learning and statistical language modeling. To perform text mining, we first preprocess the text with techniques such as stemming and lemmatization, which normalize word forms, and then convert the normalized text into numeric vectors.
- Mining Spatiotemporal Data: Spatiotemporal data is data that relates to both space and time. Spatiotemporal data mining retrieves interesting patterns and knowledge from such data. It helps us estimate the value of land, determine the age of rocks and precious stones, and predict weather patterns. It has many practical applications and data sources, such as GPS in mobile phones, timers, Internet-based map services, weather services, satellites, RFID, and sensors.
- Mining Data Streams: Stream data changes dynamically, is often noisy and inconsistent, and contains multidimensional features of different data types. For this reason it is often stored in NoSQL database systems. The very high volume of stream data is the main challenge for mining it effectively. Mining data streams involves tasks such as clustering, outlier analysis, and the online detection of rare events.
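
The text mining preprocessing mentioned above (normalizing word forms, then turning text into vectors) can be illustrated with a minimal sketch that uses only the Python standard library. The function names `tokenize`, `crude_stem`, and `vectorize` are hypothetical, and `crude_stem` is a toy suffix stripper standing in for a real stemmer or lemmatizer; it is not the Porter algorithm.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Toy suffix stripper (an assumption for illustration, not a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def vectorize(documents):
    """Return (vocabulary, vectors); each vector holds per-term counts."""
    stemmed_docs = [[crude_stem(t) for t in tokenize(d)] for d in documents]
    # A sorted vocabulary gives every document the same column order.
    vocabulary = sorted({t for doc in stemmed_docs for t in doc})
    vectors = []
    for doc in stemmed_docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


docs = ["Mining text data", "Text mining mines texts"]
vocab, vecs = vectorize(docs)
print(vocab)  # ['data', 'min', 'text']
print(vecs)   # [[1, 1, 1], [0, 2, 2]]
```

Because "mining" and "mines" both reduce to the same stem here, the two documents share columns in the resulting count vectors. In practice a dedicated NLP library would likely replace the toy stemmer and the raw counts.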
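
As one possible illustration of online outlier detection on a data stream, the sketch below keeps a running mean and variance using Welford's method and flags values that lie far from the mean. It processes each value once and stores no history, which suits high-volume streams. The class name `StreamingOutlierDetector`, the z-score threshold of 3, the warm-up length of 5, and the sample readings are all assumptions made for this example.

```python
import math


class StreamingOutlierDetector:
    """Flags values far from the running mean, using constant memory."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # values to observe before flagging starts
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value looks like an outlier, then absorb it."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's one-pass update of mean and squared deviations.
        # Note: flagged values are still absorbed in this simple sketch.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_outlier


readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1]
detector = StreamingOutlierDetector()
flags = [detector.update(x) for x in readings]
print(flags)
```

With these sample readings, only the value 25.0 is flagged. A production system might exclude flagged values from the statistics or use a sliding window so that the model adapts as the stream drifts.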