Schemas and Data Modeling in Power BI: A Beginner's Guide
Source: Dev.to
Introduction
Power BI is a data analysis and visualization tool by Microsoft that helps turn raw data into charts, graphs, and reports. In this guide we’ll explore schemas and data modeling in Power BI, covering concepts such as star schema, snowflake schema, relationships, fact and dimension tables, and why good modeling is critical for performance and accurate reporting.
Key Definitions
- Relationship – A link between two tables, based on a shared column (key), that determines how rows in one table match rows in the other.
- Physical model – The actual representation of how data is stored in the database.
- Logical model – The representation of how the business interprets the stored data.
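- Fact table – Stores measurable, numeric values (such as sales amounts or test scores) together with the keys that link each row to the dimension tables.
- Dimension table – Stores descriptive information (such as names, categories, or dates) that the fact table references and that reports use to filter and group the facts.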
Schemas
Star Schema
In a star schema, the fact table sits at the center, and each dimension table connects directly to it. Drawn as a diagram, the layout resembles a star, which gives the schema its name.
Snowflake Schema
A snowflake schema normalizes dimension tables into additional related tables, creating a more complex, branching structure compared to the star schema.
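As a rough illustration of what "normalizing a dimension" means, the sketch below splits a flat course dimension into two related tables. It uses plain Python rather than Power BI itself, and the table and column names (`dim_course_flat`, `department_name`, and so on) are hypothetical, chosen only for this example.

```python
# Hypothetical flat course dimension, as it might look in a star schema.
dim_course_flat = [
    {"course_id": 1, "course_name": "Algebra", "department_name": "Mathematics"},
    {"course_id": 2, "course_name": "Geometry", "department_name": "Mathematics"},
    {"course_id": 3, "course_name": "Biology", "department_name": "Science"},
]


def snowflake_courses(rows):
    """Split the flat dimension into a course table and a department table."""
    department_ids = {}   # department_name -> generated key
    dim_department = []
    dim_course = []
    for row in rows:
        name = row["department_name"]
        if name not in department_ids:
            # First time we see this department: give it a new key.
            department_ids[name] = len(department_ids) + 1
            dim_department.append(
                {"department_id": department_ids[name], "department_name": name}
            )
        # The course row now stores only a key pointing at the department.
        dim_course.append(
            {
                "course_id": row["course_id"],
                "course_name": row["course_name"],
                "department_id": department_ids[name],
            }
        )
    return dim_course, dim_department


dim_course, dim_department = snowflake_courses(dim_course_flat)
```

After the split, each department name is stored once in `dim_department`, and `dim_course` refers to it by key. The extra table is the "branch" that turns the star into a snowflake.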
Relationships
- One‑to‑one (1:1) – Each value in the related column appears only once in both tables, so a row in one table matches at most one row in the other.
- Many‑to‑one (M:1) – Multiple rows in one table correspond to a single row in the related table.
- Many‑to‑many (M:M) – Multiple rows in each of the two tables can relate to multiple rows in the other table.
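One way to think about these relationship types is to check whether the key values are unique on each side. The sketch below does that in plain Python; the function name `cardinality` and the sample key lists are assumptions made for illustration, not part of Power BI.

```python
from collections import Counter


def cardinality(left_keys, right_keys):
    """Classify a relationship by checking key uniqueness on each side."""
    left_unique = all(count == 1 for count in Counter(left_keys).values())
    right_unique = all(count == 1 for count in Counter(right_keys).values())
    if left_unique and right_unique:
        return "1:1"
    if right_unique:
        return "M:1"   # many rows on the left point at one row on the right
    if left_unique:
        return "1:M"
    return "M:M"


# Hypothetical key columns: a fact table repeats student IDs,
# while the student dimension lists each ID once.
fact_student_ids = [1, 1, 2, 3, 3, 3]
dim_student_ids = [1, 2, 3]

relationship = cardinality(fact_student_ids, dim_student_ids)
```

With these sample keys the fact side repeats values and the dimension side does not, so the result is a many-to-one relationship, which is the typical shape between a fact table and a dimension table.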
Modeling Approaches
- Normalized model – Creates more tables with fewer columns, reducing redundancy and inconsistencies.
- Denormalized model – Combines tables to reduce the number of joins, which can improve query performance in certain scenarios.
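The trade-off between the two approaches can be sketched in a few lines of plain Python. The example starts from two small normalized tables and combines them into one denormalized table; all names and values here are hypothetical.

```python
# Normalized form: each department name is stored exactly once.
dim_department = {10: "Mathematics", 20: "Science"}
dim_course = [
    {"course_id": 1, "course_name": "Algebra", "department_id": 10},
    {"course_id": 2, "course_name": "Geometry", "department_id": 10},
    {"course_id": 3, "course_name": "Biology", "department_id": 20},
]


def denormalize(courses, departments):
    """Copy the department name onto every course row (no join needed later)."""
    return [
        {
            "course_id": course["course_id"],
            "course_name": course["course_name"],
            "department_name": departments[course["department_id"]],
        }
        for course in courses
    ]


flat_courses = denormalize(dim_course, dim_department)

# The cost of skipping the join: "Mathematics" is now stored on two rows.
mathematics_copies = sum(
    1 for row in flat_courses if row["department_name"] == "Mathematics"
)
```

Reading `flat_courses` needs no lookup into a second table, but the repeated department name is the redundancy that a normalized model avoids.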
Practical Example
Consider a student data set. In a star schema, you might have a FactStudentPerformance table containing numeric scores, surrounded by dimension tables such as DimStudent, DimCourse, and DimTerm.
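To make the student example concrete, here is a minimal sketch that builds the same star schema in an in-memory SQLite database using Python's standard library. This is a stand-in for a Power BI model, not how Power BI works internally: the column names, sample rows, and the `average_score_by_course` helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one row per student, course, and term.
    CREATE TABLE DimStudent (student_id INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE DimCourse  (course_id  INTEGER PRIMARY KEY, course_name  TEXT);
    CREATE TABLE DimTerm    (term_id    INTEGER PRIMARY KEY, term_name    TEXT);

    -- Fact table: one row per score, with a key into each dimension.
    CREATE TABLE FactStudentPerformance (
        student_id INTEGER REFERENCES DimStudent(student_id),
        course_id  INTEGER REFERENCES DimCourse(course_id),
        term_id    INTEGER REFERENCES DimTerm(term_id),
        score      REAL
    );

    -- Hypothetical sample data.
    INSERT INTO DimStudent VALUES (1, 'Amina'), (2, 'Brian');
    INSERT INTO DimCourse  VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO DimTerm    VALUES (1, 'Term 1');
    INSERT INTO FactStudentPerformance VALUES
        (1, 1, 1, 80.0), (1, 2, 1, 70.0),
        (2, 1, 1, 90.0), (2, 2, 1, 60.0);
    """
)


def average_score_by_course(connection):
    """Join the fact table to DimCourse and average the scores per course."""
    rows = connection.execute(
        """
        SELECT c.course_name, AVG(f.score)
        FROM FactStudentPerformance AS f
        JOIN DimCourse AS c ON c.course_id = f.course_id
        GROUP BY c.course_name
        ORDER BY c.course_name
        """
    ).fetchall()
    return dict(rows)


averages = average_score_by_course(conn)
```

The fact table holds only keys and the numeric score, while the readable labels live in the dimension tables. In Power BI, the joins would be expressed as relationships between the tables, and the average would typically be defined as a measure rather than written as SQL.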
Importance of Good Modeling
Good modeling is critical for performance and accurate reporting. It ensures that analyses are reliable, queries run efficiently, and the data reflects the business logic correctly.
Enjoy your modeling journey!