Does Dark Data Have Any Worth In The Big Data World?

Big Data is often called the new oil of modern times, and the companies that can analyze this data for actionable insights stand to gain the most. More and more companies understand this and are investing in Big Data Analytics. The share of companies doing so reached 53% in 2017, a huge jump from 17% in 2015.


But Big Data comes in multiple types. There is Critical Business Data, which is the data most often analyzed by businesses to meet their goals, increase their revenue, etc. This is the data that commonly comes to mind when we think of Big Data. Another type is ROT data (redundant, obsolete and trivial data). As the name suggests, this data is not important to a business and can be discarded.



Next we have Dark Data. This is a less famous cousin of Big Data that few people have heard of and even fewer understand. So today, we’ll discuss Dark Data and try to understand its worth in the Big Data world. First, let’s address the fundamental question, i.e. what is Dark Data?

What is Dark Data?

Most companies collect, process and store large amounts of data that may help them improve their goods and services in the future. Got a new Samsung phone? Samsung will likely collect your usage data. Have a Facebook account? Facebook collects your browsing data, your friend lists, etc. And this is true for almost all companies. Once this data is collected, Data Analytics comes into the picture!

But a large part of the collected data cannot be analyzed using conventional data analytics. This data is known as Dark Data, and it has a massive amount of untapped potential. While Dark Data could provide insights that lead to higher profits and more business growth, it is mostly just stored in the company archives and rarely analyzed. That’s because it is very difficult to capture, identify and accurately analyze Dark Data.

Some common examples of Dark Data are given here:

  • Spreadsheets
  • Email attachments and .zip files that are ignored after a look
  • Inactive and old databases
  • Previous employee details
  • Log files
  • Analytics reports and survey data
  • Old versions of documents still available
  • Personal data additions like project notes

All these examples of Dark Data are items that are left over and no longer considered important. So this Dark Data is disregarded when it could, in fact, be mined for highly valuable insights.

What are the Different Dimensions of Dark Data?

Dark Data is basically divided into 3 dimensions which constitute the different types of Dark Data. So now let’s see what these are:

1. Traditional Unstructured Data

Nearly 80% of all available data in the world is estimated to be Traditional Unstructured Data, and that is only one part of Dark Data. There’s obviously a lot of Dark Data in the world! Traditional Unstructured Data is basically text-based data that is not organized in a predefined manner. This can include all sorts of data in an organization, like emails, office documents, employee messages, etc., which do not have a uniform structure. So analyzing this data to obtain actionable insights is a very tough job for organizations.
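To make "not organized in a predefined manner" more concrete, here is a minimal sketch of how free-form text might be turned into small structured records that conventional analytics can count. The sample messages, the order-number pattern and the complaint keyword list are all hypothetical and chosen only for illustration; a real project would likely rely on a proper text-analytics or NLP tool instead of a keyword list.

```python
import re
from collections import Counter

# Hypothetical free-form messages; not taken from any real system.
MESSAGES = [
    "Hi team, order #4521 arrived damaged. Please refund. - Ana",
    "Order #4521 refund still pending, customer is upset.",
    "Thanks for the quick delivery of order #7788!",
]

# Assumed convention: order numbers are written as '#' followed by digits.
ORDER_ID = re.compile(r"#(\d+)")
# Illustrative keyword list, not a real sentiment model.
COMPLAINT_WORDS = {"damaged", "refund", "pending", "upset"}


def to_record(text):
    """Turn one free-form message into a small structured record."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    match = ORDER_ID.search(text)
    return {
        "order_id": match.group(1) if match else None,
        "complaint_terms": sorted(words & COMPLAINT_WORDS),
    }


def complaints_per_order(messages):
    """Count messages per order that contain at least one complaint term."""
    counts = Counter()
    for text in messages:
        record = to_record(text)
        if record["order_id"] and record["complaint_terms"]:
            counts[record["order_id"]] += 1
    return dict(counts)
```

Calling `complaints_per_order(MESSAGES)` on the sample above groups the two complaint messages under order 4521 and ignores the positive message, which is the kind of signal that stays hidden while the text sits unread in an archive.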

2. Non-Traditional Unstructured Data

While Traditional Unstructured Data is mainly text-based, Non-Traditional Unstructured Data is even more complicated! This data is mostly made up of audio and video files, often produced by real-time applications. This form of Dark Data is even more difficult to analyze, as the meaning of real-time data may change over time. And if this data is not analyzed in a timely manner, it may lose its value and become obsolete.


3. Deep Web Data

The data in the deep web isn’t easily accessible to everyone. You can’t really use Google to see it! This deep web data is a part of Dark Data that is very difficult to access, let alone analyze. It is estimated that the deep web is approximately 500 times bigger than the surface web which you normally explore. So there is a large amount of untapped potential in Deep Web Data.

How to Handle Dark Data for Maximum Benefit?

If most of the available data is Dark Data that can provide huge benefits to an organization, the question is how to handle it to obtain those benefits. This is where Dark Analytics comes in! Dark Analytics involves capturing Dark Data, unlocking its potential and then gaining actionable business intelligence.

The most difficult part of this is capturing the Dark Data. That’s because this data is not structured or uniform, so specially adapted systems are required to capture it. These systems should know what to look for and where to look for it. After the Dark Data is identified and captured, it is equally important to use a Big Data platform to unlock it and understand its secrets. Then a business intelligence solution can be built on this Dark Data to increase the productivity and income of a company.
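As a rough illustration of the "what to look for and where to look for it" step, the sketch below walks a folder and lists files that have not been modified for a long time, grouped by a coarse category. This is only one possible way to build a first inventory of Dark Data candidates. The extension-to-category mapping, the one-year threshold and the function name `find_stale_files` are assumptions made for this example, not part of any standard tool.

```python
import os
import time
from collections import defaultdict

# Assumed mapping from file extension to a rough category; adjust as needed.
CATEGORIES = {
    ".csv": "spreadsheet", ".xlsx": "spreadsheet",
    ".log": "log", ".zip": "archive",
    ".doc": "document", ".docx": "document", ".txt": "document",
}


def find_stale_files(root, max_age_days=365, now=None):
    """Return {category: [paths]} for files not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    stale = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                ext = os.path.splitext(name)[1].lower()
                stale[CATEGORIES.get(ext, "other")].append(path)
    return dict(stale)
```

For example, `find_stale_files("/path/to/shared/drive")` (a placeholder path) would return the old spreadsheets, logs, archives and documents found under that folder. An inventory like this only identifies candidates; deciding which of them are worth loading into a Big Data platform is still a separate step.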

However, many companies avoid handling Dark Data because of its complexity. This attitude needs to change if companies really want to increase their profits and open a new dimension in business by harnessing the power of Dark Data. By using structured and unstructured data together, companies can obtain results that make the cost of Data Analytics more than worth it!


