SCHEMAS AND DATA MODELLING IN POWER BI.
Source: Dev.to
Effective Data Modelling in Power BI
Effective data modelling is the cornerstone of powerful and performant Power BI reports. It means structuring your data so that analysis, reporting, and data processing are all efficient.
Fact and Dimension Tables: The Building Blocks
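Dimension (Lookup) Tables
Dimension tables describe the entities involved in a business process (the who, what, where, and when) and supply the attributes used to filter and group data. Typical examples: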
- Customer Dimension: Customer Name, Address, City, Region
- Product Dimension: Product Name, Category, Sub‑Category, Brand
- Date Dimension: Year, Quarter, Month, Day of Week
- Location Dimension: Store Name, City, State, Country
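Dimension tables are typically wider (more columns) but have fewer rows than fact tables. In a star schema they are usually denormalized, with all attributes kept in a single table; normalizing them into several related tables produces a snowflake schema.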
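Fact (Event) Tables
Fact tables record measurable events or transactions, storing numeric measures alongside the keys that link each row to its dimensions. Typical examples: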
- Sales Fact: Order Quantity, Sales Amount, Discount, Unit Price (along with foreign keys to Customer, Product, Date, and Store dimensions)
- Web Traffic Fact: Page Views, Session Duration, Bounce Rate (along with foreign keys to User, Page, and Date dimensions)
Fact tables are usually deeper (more rows) but narrower (fewer columns) than dimension tables. They can grow very large as they record every event or transaction.
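The split between the two table types can be sketched in plain Python. This is only an illustration, not Power BI code: the table names, column names, and sample rows are hypothetical. It shows a narrow fact table whose rows carry measures and a foreign key, and a dimension table that supplies the descriptive attribute used for grouping.

```python
# Hypothetical sample data; names and values are illustrative only.

# Dimension (lookup) table: one row per product, keyed by product_key.
product_dim = {
    1: {"product_name": "Trail Shoe", "category": "Footwear", "brand": "Acme"},
    2: {"product_name": "Rain Jacket", "category": "Apparel", "brand": "Acme"},
}

# Fact (event) table: one row per sale, holding measures plus a foreign key.
sales_fact = [
    {"product_key": 1, "order_quantity": 2, "sales_amount": 180.0},
    {"product_key": 2, "order_quantity": 1, "sales_amount": 120.0},
    {"product_key": 1, "order_quantity": 1, "sales_amount": 90.0},
]


def sales_by(attribute):
    """Sum sales_amount grouped by a descriptive attribute of the product."""
    totals = {}
    for row in sales_fact:
        # Follow the foreign key from the fact row to its dimension row.
        label = product_dim[row["product_key"]][attribute]
        totals[label] = totals.get(label, 0.0) + row["sales_amount"]
    return totals


print(sales_by("category"))  # {'Footwear': 270.0, 'Apparel': 120.0}
```

The fact rows never repeat the product name or category; they only point at the dimension row, which is what keeps the fact table narrow as it grows.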
Star Schema
Key Characteristics
- Central Fact Table – contains all measurable data and foreign keys to every dimension table.
- Directly Linked Dimension Tables – each dimension table is linked straight to the fact table; no intermediate tables.
- Denormalized Dimensions – all attributes for a dimension reside in a single table, even if some could belong to a more granular dimension.
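As a rough illustration of these characteristics, the sketch below builds a tiny star schema in an in-memory SQLite database using Python's standard `sqlite3` module. SQLite merely stands in for a relational source here; the table names (`dim_product`, `dim_date`, `fact_sales`) and the data are assumptions made for the example, not part of any Power BI API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimensions: every attribute lives in a single table.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER,
    month INTEGER
);
-- Central fact table: measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Trail Shoe', 'Footwear'), (2, 'Rain Jacket', 'Apparel');
INSERT INTO dim_date VALUES (20240115, 2024, 1), (20240210, 2024, 2);
INSERT INTO fact_sales VALUES (1, 20240115, 90.0), (1, 20240210, 180.0), (2, 20240210, 120.0);
""")

# Each dimension is exactly one join away from the fact table.
star_rows = conn.execute("""
    SELECT p.category, d.month, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()

for row in star_rows:
    print(row)
```

Note that the query touches one join per dimension and nothing deeper, which is the property the star layout is designed to preserve.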
Advantages
- Easy to understand, design, and implement.
- Excellent query performance due to fewer joins; most queries only need to join the fact table with a few dimensions.
- Business users can navigate the model and grasp relationships quickly.
- Power BI’s VertiPaq engine is highly optimized for star schemas, yielding faster aggregations and calculations.
Disadvantages
- Denormalized dimensions can introduce data redundancy.
- Less efficient for very complex or deep hierarchical relationships within dimensions.
Snowflake Schema
Key Characteristics
- Dimension tables are normalized, breaking them into multiple related tables to reduce redundancy.
- Dimensions can have multiple levels of sub‑dimensions (e.g., a “Product” dimension linked to “Product Category” and “Product Subcategory” tables).
- Involves more tables than a star schema because of the normalization.
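To make the normalization concrete, here is a minimal Python sketch that splits a single wide product dimension into product, subcategory, and category tables. The function name `snowflake`, the column names, and the sample rows are hypothetical; the surrogate keys are simply assigned in order of first appearance.

```python
# Hypothetical wide (star-style) product dimension: category text repeats.
star_product_dim = [
    {"product_key": 1, "product_name": "Trail Shoe", "subcategory": "Shoes", "category": "Footwear"},
    {"product_key": 2, "product_name": "Road Shoe", "subcategory": "Shoes", "category": "Footwear"},
    {"product_key": 3, "product_name": "Hiking Boot", "subcategory": "Boots", "category": "Footwear"},
]


def snowflake(rows):
    """Split one wide dimension into product -> subcategory -> category."""
    categories = {}      # category name -> category_key
    subcategories = {}   # (subcategory name, category_key) -> subcategory_key
    products = []
    for row in rows:
        # setdefault assigns the next key only the first time a value is seen.
        cat_key = categories.setdefault(row["category"], len(categories) + 1)
        sub_key = subcategories.setdefault(
            (row["subcategory"], cat_key), len(subcategories) + 1
        )
        products.append({
            "product_key": row["product_key"],
            "product_name": row["product_name"],
            "subcategory_key": sub_key,
        })
    category_table = [
        {"category_key": key, "category": name}
        for name, key in categories.items()
    ]
    subcategory_table = [
        {"subcategory_key": key, "subcategory": name, "category_key": cat_key}
        for (name, cat_key), key in subcategories.items()
    ]
    return products, subcategory_table, category_table


products, subcategory_table, category_table = snowflake(star_product_dim)
print(len(products), len(subcategory_table), len(category_table))  # 3 2 1
```

In the wide table the text "Footwear" is stored three times; after the split it is stored once, at the cost of two extra tables and two extra key columns.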
Advantages
- Normalization minimizes redundant data storage, beneficial for large dimension tables with repeating attributes.
- Improved data integrity due to normalized structures.
- Better suited for handling complex and deep hierarchical dimensions.
Disadvantages
- More complex to design, understand, and maintain because of the increased number of tables and joins.
- Queries often require additional joins, which can negatively impact performance on large datasets.
- Business users may find it harder to navigate and comprehend the relationships.
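The extra joins can be seen in a small Python sketch that flattens a snowflaked product dimension back into one star-style table. The tables, keys, and the function name `flatten_product_dim` are hypothetical; in practice a comparable flattening step might be done while preparing the data, before it is loaded into the model.

```python
# Hypothetical snowflaked dimension: three tables chained by keys.
category = {10: {"category": "Footwear"}}
subcategory = {
    100: {"subcategory": "Shoes", "category_key": 10},
    101: {"subcategory": "Boots", "category_key": 10},
}
product = {
    1: {"product_name": "Trail Shoe", "subcategory_key": 100},
    2: {"product_name": "Hiking Boot", "subcategory_key": 101},
}


def flatten_product_dim():
    """Resolve the lookup chain once and return a single wide dimension."""
    flat = {}
    for product_key, prod in product.items():
        sub = subcategory[prod["subcategory_key"]]  # first hop
        cat = category[sub["category_key"]]         # second hop
        flat[product_key] = {
            "product_name": prod["product_name"],
            "subcategory": sub["subcategory"],
            "category": cat["category"],
        }
    return flat


flat_product_dim = flatten_product_dim()
print(flat_product_dim[2])
```

Reaching the category from a product takes two hops in the snowflaked form and none in the flattened form, which is the trade-off between storage redundancy and query simplicity described above.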
Relationships: Connecting the Dots
The Importance of Good Data Modelling
- Performance Optimization – A well‑structured model reduces the amount of data Power BI must process per query, leading to faster report loading, quicker interactions, and a smoother user experience. Poorly modelled data can cause slow reports, long refresh times, and even crashes.
- Accuracy and Consistency – Clear models ensure calculations and aggregations are performed correctly and consistently across all reports, minimizing ambiguity and the risk of incorrect insights.
- Ease of Use and Maintainability – Logical, intuitive models make it easier for developers to locate and use the right fields, and simplify maintenance and updates when underlying data sources change.
- Scalability – As data volumes grow, a well‑designed model scales more effectively, preventing performance bottlenecks and keeping Power BI viable for expanding analytical needs.
- Data Storytelling and Insight Generation – Good modelling presents data in a logical, easy‑to‑understand format, empowering users to extract meaningful insights and make informed decisions.
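One way to support the accuracy point above is to check key quality before data reaches the model. The sketch below is a minimal Python example with hypothetical function names (`duplicate_keys`, `orphan_keys`) and sample rows. It looks for two common problems: duplicate keys in a dimension, which would prevent it from acting as the "one" side of a one-to-many relationship, and fact rows whose key has no matching dimension row.

```python
from collections import Counter

# Hypothetical sample data containing one duplicate key and one orphan key.
customer_dim = [
    {"customer_key": 1, "customer_name": "Asha"},
    {"customer_key": 2, "customer_name": "Ben"},
    {"customer_key": 2, "customer_name": "Ben (duplicate)"},
]
sales_fact = [
    {"customer_key": 1, "sales_amount": 50.0},
    {"customer_key": 2, "sales_amount": 75.0},
    {"customer_key": 3, "sales_amount": 20.0},
]


def duplicate_keys(dimension_rows, key):
    """Return key values that appear more than once in the dimension."""
    counts = Counter(row[key] for row in dimension_rows)
    return sorted(value for value, n in counts.items() if n > 1)


def orphan_keys(fact_rows, dimension_rows, key):
    """Return fact key values that have no matching dimension row."""
    known = {row[key] for row in dimension_rows}
    return sorted({row[key] for row in fact_rows} - known)


print(duplicate_keys(customer_dim, "customer_key"))           # [2]
print(orphan_keys(sales_fact, customer_dim, "customer_key"))  # [3]
```

Both lists should be empty for a clean model; anything they report is worth resolving at the source rather than working around in report calculations.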