A data-warehouse is a heterogeneous collection of different data sources organised under a unified schema. There are 2 approaches for constructing data-warehouse: Top-down approach and Bottom-up approach are explained as below.
1. Top-down approach:
The essential components are discussed below:
- External Sources –
External source is a source from where data is collected irrespective of the type of data. Data can be structured, semi structured and unstructured as well.
- Stage Area –
Since the data, extracted from the external sources does not follow a particular format, so there is a need to validate this data to load into datawarehouse. For this purpose, it is recommended to use ETL tool.
- E(Extracted): Data is extracted from External data source.
- T(Transform): Data is transformed into the standard format.
- L(Load): Data is loaded into datawarehouse after transforming it into the standard format.
- Data-warehouse –
After cleansing of data, it is stored in the datawarehouse as central repository. It actually stores the meta data and the actual data gets stored in the data marts. Note that datawarehouse stores the data in its purest form in this top-down approach.
- Data Marts –
Data mart is also a part of storage component. It stores the information of a particular function of an organisation which is handled by single authority. There can be as many number of data marts in an organisation depending upon the functions. We can also say that data mart contains subset of the data stored in datawarehouse.
- Data Mining –
The practice of analysing the big data present in datawarehouse is data mining. It is used to find the hidden patterns that are present in the database or in datawarehouse with the help of algorithm of data mining.
This approach is defined by Inmon as – datawarehouse as a central repository for the complete organisation and data marts are created from it after the complete datawarehouse has been created.
Advantages of Top-Down Approach –
- Since the data marts are created from the datawarehouse, provides consistent dimensional view of data marts.
- Also, this model is considered as the strongest model for business changes. That’s why, big organisations prefer to follow this approach.
- Creating data mart from datawarehouse is easy.
Disadvantages of Top-Down Approach –
- The cost, time taken in designing and its maintainence is very high.
2. Bottom-up approach:
- First, the data is extracted from external soures (same as happens in top-down approach).
- Then, the data go through the staging area (as explained above) and loaded into data marts instead of datawarehouse. The data marts are created first and provide reporting capability. It addresses a single business area.
- These data marts are then integrated into datawarehouse.
This approach is given by Kinball as – data marts are created first and provides a thin view for analyses and datawarehouse is created after complete data marts have been created.
Advantages of Bottom-Up Approach –
- As the data marts are created first, so the reports are quickly generated.
- We can accomodate more number of data marts here and in this way datawarehouse can be extended.
- Also, the cost and time taken in designing this model is low comparatively.
Disadvantage of Bottom-Up Approach –
- This model is not strong as top-down approach as dimensional view of data marts is not consistent as it is in above approach.
Attention reader! Don’t stop learning now. Get hold of all the important DSA concepts with the DSA Self Paced Course at a student-friendly price and become industry ready.
- Difference between Data Lake and Data Warehouse
- Data Mining: Data Warehouse Process
- Difference between Data Warehouse and Data Mart
- Testing in Data warehouse
- ETL Process in Data Warehouse
- Difference between Data Warehouse and Hadoop
- Characteristics and Functions of Data warehouse
- Building a Data Warehouse in DBMS
- Star Schema in Data Warehouse modeling
- Fact Constellation in Data Warehouse modelling
- Snowflake Schema in Data Warehouse Model
- Types of Keys in Data Warehouse Schema
- Difference between Database System and Data Warehouse
- Differences between Operational Database Systems and Data Warehouse
- Data Architecture Design and Data Management
- NoSQL Data Architecture Patterns
- Types and Part of Data Mining architecture
- RDBMS Architecture
- Architecture of Apache Cassandra
- Introduction of 3-Tier Architecture in DBMS | Set 2
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to email@example.com. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.