Arikah Map

Snowflake schema

The snowflake schema is a variation of the star schema used in a data warehouse. The snowflake and star schema are methods of storing data which are multidimensional in nature (i.e. which can be analysed by any or all of a number of independent factors) in a relational database

The snowflake schema (sometimes callled snowflake join schema) is a more complex schema than the star schema because the tables which describe the dimensions are normalized.


Contents

Data normalization and storage

Normalization splits up data to avoid redundancy (duplication) by moving commonly repeating groups of data into a new table. Normalization therefore tends to increase the number of tables that need to be joined in order to perform a given query, but reduces the space required to hold the data and the number of places where it needs to be updated if the data changes.

In a data warehouse, the fact table in which data values (and its associated indexes) are stored, is typically responsible for 90% or more of the storage requirements, so the benefit here is normally insignificant.

However, normalization of the dimension tables ("snowflaking") can impair the performance of a data warehouse. Whereas conventional databases can be tuned to match the regular pattern of usage, such patterns rarely exist in a data warehouse. Snowflaking will increase the time taken to perform a query, and the design goals of many data warehouse projects is to minimize these response times.

Also, many data warehouses are designed to be used by business users, and the added complexity of the snowflake schema will often preclude non-specialist users from forming their own queries.

Benefits of "snowflaking"

In practice, many data warehouses will normalize some dimensions and not others, and hence use a combination of snowflake and classic star schema.

See also

References

Categories


Wikipedia articles needing clarification | Wikipedia articles needing context | Data warehousing

Find

Find

Find