Introduction: The snowflake schema is a variant of the star schema. Here, the centralized fact table is connected to multiple dimensions. In the snowflake schema, dimension are present in a normalized from in multiple related tables. The snowflake structure materialized when the dimensions of a star schema are detailed and highly structured, having several levels of relationship, and the child tables have multiple parent table. The snowflake effect affects only the dimension tables and does not affect the fact tables.
The Employee dimension table now contains the attributes: EmployeeID, EmployeeName, DepartmentID, Region, Territory. The DepartmentID attribute links with Employee table with the Department dimension table. The Department dimension is used to provide detail about each department, such as Name and Location of the department. The Customer dimension table now contains the attributes: CustomerID, CustomerName, Address, CityID. The CityID attributes links the Customer dimension table with the City dimension table. The City dimension table has details about each city such as CityName, Zipcode, State and Country.
The main difference between star schema and snowflake schema is that the dimension table of the snowflake schema are maintained in normalized form to reduce redundancy. The advantage here is that such table(normalized) are easy to maintain and save storage space. However, it also means that more joins will be needed to execute query. This will adversely impact system performance.
What is snowflaking?
The snowflake design is the result of further expansion and normalized of the dimension table. In other words, a dimension table is said to be snowflaked if the low-cardinality attribute of the dimensions have been divided into separate normalized tables. These tables are then joined to the original dimension table with referential constrains(foreign key constrain).
Generally, snowflaking is not recommended in the dimension table, as it hampers the understandability and performance of the dimension model as more tables would be required to be joined to satisfy the queries.
Characteristics of snowflake schema: The dimension model of snowflake under the following conditions:
- The snowflake schema uses small disk space.
- It is easy to implement dimension is added to schema.
- There are multiple tables, so performance is reduced.
- The dimension table consist of two or more sets of attributes which define information at different grains.
- The sets of attributes of the same dimension table are being populate by different source systems.
Advantages: There are two main advantages of snowflake schema given below:
- It provides structured data which reduces the problem of data integrity.
- It uses small disk space because data are highly structured.
- Snowflaking reduces space consumed by dimension tables, but compared with the entire data warehouse the saving is usually insignificant.
- Avoid snowflaking or normalization of a dimension table, unless required and appropriate.
- Do not snowflake hierarchies of one dimension table into separate tables. Hierarchies should belong to the dimension table only and should never be snowfalked.
- Multiple hierarchies can belong to the same dimension has been designed at the lowest possible detail.
Attention reader! Don’t stop learning now. Get hold of all the important CS Theory concepts for SDE interviews with the CS Theory Course at a student-friendly price and become industry ready.
- Difference between Star Schema and Snowflake Schema
- Difference between Snowflake Schema and Fact Constellation Schema
- Star Schema in Data Warehouse modeling
- Types of Keys in Data Warehouse Schema
- Difference between Star Schema and Fact Constellation Schema
- Difference between Data Warehouse and Data Mart
- Data Mining: Data Warehouse Process
- Difference between Data Lake and Data Warehouse
- Characteristics and Functions of Data warehouse
- Data Warehouse Architecture
- ETL Process in Data Warehouse
- Fact Constellation in Data Warehouse modelling
- Building a Data Warehouse in DBMS
- Difference between Database System and Data Warehouse
- Differences between Operational Database Systems and Data Warehouse
- Testing in Data warehouse
- Difference between Data Warehouse and Hadoop
- Mapping from ER Model to Relational Model
- Difference between Bottom-Up Model and Top-Down Model
- Difference between E-R Model and Relational Model in DBMS
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to firstname.lastname@example.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.
Improved By : Akanksha_Rai