A Type 2 Slowly Changing Dimension (SCD Type 2) preserves the full history of dimension attribute changes by adding a new row for each change with effective date tracking. This guide explains when Type 2 is necessary, how to implement it, and the impact on BI queries.
A Type 2 Slowly Changing Dimension (SCD Type 2) is the standard technique for preserving the history of dimension attribute changes in a data warehouse. When a customer moves from one sales territory to another, or a product is reclassified from one category to another, a Type 2 implementation creates a new dimension row for the new state rather than overwriting the existing one. Historical fact rows continue to reference the old dimension row — the one with the attribute values as they were at the time the fact was recorded.
This is the mechanism that makes it possible to ask "what was the sales territory of each customer at the time of each purchase?" and get an accurate answer — even though customers have since been reassigned to different territories. Without Type 2 history, the same question returns current territory for all historical purchases, which is analytically wrong.
The Three Core Columns
Type 2 implementation requires three additional columns beyond the standard dimension attributes:
**Effective start date (row_effective_date or dbt_valid_from):** The date and time from which this row's attribute values are the current state. For the first record of a new entity, this is typically the entity's creation date or the first time it appeared in the source system. For subsequent records representing changes, this is the date of the change.
**Effective end date (row_expiry_date or dbt_valid_to):** The date and time at which this row's attributes were superseded by a new version. When a change occurs, the current row's end date is set to the change date; the new row has this date as its start date. The currently active row has an effective_end_date of NULL or a sentinel far-future date (9999-12-31 by convention).
**Is_current (or is_active) flag:** A boolean that is true for the currently active row for each entity, false for historical rows. Simplifies queries that want only current dimension state — eliminates the need to construct date range logic for "give me current attributes."
How Type 2 Works in Practice
Initial state: Customer ABC Corp is assigned to the Northeast territory. Their dimension row has surrogate key 1001, customer_id = 'ACCT-00891', sales_territory = 'Northeast', effective_start = 2023-01-01, effective_end = NULL, is_current = true.
Change event: ABC Corp is reassigned to the Mid-Atlantic territory effective 2024-06-01.
Type 2 response:
1. The existing row (surrogate key 1001) is updated: effective_end = 2024-06-01, is_current = false
2. A new row is inserted: surrogate key 1002, customer_id = 'ACCT-00891', sales_territory = 'Mid-Atlantic', effective_start = 2024-06-01, effective_end = NULL, is_current = true
Fact table rows with order_date before 2024-06-01 were created with customer_key = 1001. Fact rows after 2024-06-01 use customer_key = 1002. A query joining fact to customer dimension returns 'Northeast' for pre-June orders and 'Mid-Atlantic' for post-June orders — correct historical attribution.
Querying Type 2 Dimensions
The complication introduced by Type 2 is query complexity. A query for "current performance by territory" simply filters on is_current = true. A query for "historical performance by territory as of each transaction date" must join on the surrogate key — which is already done correctly if the fact table was loaded with the correct surrogate key at load time.
The risk is when analysts write queries directly against the dimension without using the fact table's surrogate key:
Wrong: JOIN customer ON fact.customer_id = dim.customer_id WHERE dim.is_current = true
Right: JOIN customer ON fact.customer_key = dim.customer_key
The wrong query always returns the current territory, regardless of when the fact occurred. The right query returns the territory as it was when the fact was loaded.
BI tools like Tableau and Power BI join dimension tables through the surrogate key relationship defined in the data model — this is correct by construction when the data model is set up properly. Direct SQL queries by analysts are where the incorrect join most commonly appears.
When Type 2 Is Required (and When It Is Not)
Type 2 is required when the analytical question "as it was at the time" matters for the attribute in question.
Requires Type 2:
- Sales territory attribution in revenue reports (was this sale credited to the right territory when it occurred?)
- Customer segment for cohort analysis (was this customer in the mid-market segment when they first purchased?)
- Product category for historical product mix analysis (what was the category classification when this product was selling?)
- Employee department for HR analytics (which department was this employee in when they closed this deal?)
Does not require Type 2 (Type 1 overwrite is appropriate):
- Data entry corrections (the customer's name was misspelled and has been corrected)
- Attributes that are definitionally current-state only (current billing address for sending renewal invoices)
- Low-impact attributes where tracking history adds no analytical value
dbt Implementation
dbt provides two mechanisms for Type 2 SCD implementation:
**dbt snapshots:** dbt's built-in snapshot functionality runs a snapshot command against a source table, comparing the current state to the previously snapshotted state. For changed rows, dbt automatically manages the effective date columns (dbt_valid_from, dbt_valid_to, dbt_updated_at) and generates new rows. Snapshots are the recommended approach for source tables where dbt has access to query the current state on a schedule.
**Manual SCD model:** Some implementations manage Type 2 logic in a dbt incremental model — comparing the incoming source state to the current dimension state and generating the insert/update actions. More complex but more flexible for environments where snapshot conditions are not met.
Our data architecture services practice designs and implements dimensional models including SCD Type 2 dimensions via dbt snapshots, ensuring accurate historical attribution for revenue, customer, and product analysis. Contact us to discuss your data warehouse modeling requirements.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →