Data Warehouse Testing: How to Validate Data Quality, Accuracy, and Completeness

A data warehouse without automated testing is a data warehouse that erodes trust every time a silent pipeline failure produces wrong numbers. Here is the comprehensive testing strategy.

The quick answer

Testing a data warehouse means verifying that the data it contains is correct, complete, consistent, and timely — at every stage from ingestion through transformation to serving. Without automated testing, data quality issues propagate silently until analysts discover them in dashboards. The testing strategy has three layers: source-level freshness and schema validation, transformation-layer correctness tests (dbt tests), and end-to-end reconciliation against source system totals. This guide covers the complete testing framework.

Why data warehouse testing is different from application testing

In application testing, you test that code produces correct output given known inputs. In data warehouse testing, the "inputs" are source systems outside your control — operational databases that can change schema, add columns, change data types, or produce unexpected values at any time. The "code" (SQL transformations) can have logical errors that produce incorrect aggregations, incorrect joins, or incorrect business rule implementations.

The failure modes are different too. A production bug in an application is usually immediately visible — the feature breaks, users report it. A data warehouse bug can silently produce incorrect numbers for weeks or months. Nobody notices until a VP is in a board meeting with a revenue number that does not match the finance system. By then, the root cause is buried in weeks of pipeline history.

The testing strategy addresses both failure modes: source system change detection (schema tests, freshness tests) and transformation correctness (data quality tests, reconciliation).

Layer 1: Source validation

Source validation tests run when data is first loaded into the warehouse — before any transformation. These tests detect issues in the source data or the ingestion process.

**Freshness tests**: verify that data was delivered within the expected window. Configure freshness expectations on every source in dbt's sources.yml: warn_after (how long before a warning is triggered), error_after (how long before an error is triggered). Run dbt source freshness on each pipeline execution. A source that has not updated in 25 hours when it is expected hourly is a detectable, alertable condition — not a silent gap.

**Row count validation**: verify that the loaded row count falls within an expected range. A daily load of customer records that produces 0 rows is clearly wrong; one that produces 10x the expected count is also suspicious. Great Expectations and Soda support row count assertions; dbt-expectations adds expect_table_row_count_to_be_between.

**Schema validation**: verify that the source table's columns match the expected schema. When a source system renames or drops a column, the downstream dbt model that references that column fails — but only when the model runs. Schema tests fail immediately at source load time, before transformation runs and before downstream models break. Configure column existence tests on source tables for critical fields.

**Null rate monitoring**: for columns that should be almost never null (customer_id, order_id, transaction_amount), alert when the null rate exceeds a threshold. A column that was 0.1% null and is now 15% null indicates an upstream bug — catchable before it corrupts downstream aggregations.

Layer 2: Transformation correctness tests

Transformation tests run after dbt models build, verifying that the transformation logic produces correct output. These are the tests that catch SQL bugs, join errors, and business logic mistakes.

**Primary key tests**: every mart model's primary key must be unique and not null. These are the minimum required tests — if a fact table has duplicate order_id rows, every downstream metric that aggregates by order is wrong. Add not_null and unique to the primary key of every mart model.

**Foreign key relationships**: every foreign key must reference an existing primary key. A fact table row with a customer_id that does not exist in the customers dimension is an orphaned record — it produces nulls in joined dashboards and incorrect customer-level aggregations. Use the relationships test to assert referential integrity for every foreign key in every fact table.

**Accepted values**: for status columns, category columns, and any column with a defined set of valid values, assert that only those values appear. If order_status should only contain 'pending', 'fulfilled', 'cancelled', 'refunded', a row with status = 'FULFILLED' (wrong case) or status = 'completed' (unexpected value) indicates either a source system change or a transformation bug.

**Business rule assertions**: custom singular tests (SQL queries that return zero rows when data is valid) verify business rules that generic tests cannot capture. Examples:

- Total revenue should never be negative

- Order date should never be in the future

- Refund amount should never exceed original order amount

- Monthly active user count should not drop more than 50% month-over-month (a sudden drop indicates a pipeline bug, not a business event)

**Statistical drift tests**: for large, slowly changing tables, test that aggregate metrics stay within statistical norms. A daily load of transaction data should produce a revenue total within 2 standard deviations of recent daily averages. Sudden large swings indicate ingestion failures or logic errors.

Layer 3: End-to-end reconciliation

Reconciliation tests compare warehouse aggregates to source system totals. This is the highest-confidence test — if the warehouse total matches the source system total, the full pipeline from ingestion through transformation is correct.

**Revenue reconciliation**: daily revenue in the warehouse should match daily revenue in the billing system (within a tolerance for timing differences — orders placed near midnight may appear in different days). Query both systems and compare.

**Record count reconciliation**: total customer count in the warehouse should match total active customer count in the CRM, minus the known filters (inactive accounts excluded, test accounts excluded). Document the expected delta and assert that the actual delta stays within bounds.

**Running these tests**: reconciliation queries require access to both the warehouse and the source system API or a read replica. Run them as SQL tasks in Airflow or dbt (as singular tests calling external data), alert when the delta exceeds tolerance.

Test execution and alerting

**CI/CD integration**: dbt tests should run in CI on every PR that modifies a model or its upstream. Use dbt's --select state:modified+ to run only tests on modified models and their downstream dependents. PR merges that fail tests should not be allowed.

**Production test schedule**: run full dbt test suite after every successful dbt run in production. Failed tests trigger alerts via Slack, PagerDuty, or email based on severity. Critical test failures (null primary key, referential integrity failure) should page on-call; warning-severity failures notify the relevant team.

**Test result tracking**: write test results to a monitoring table (track pass/fail/warning per test per run). Dashboard this: test pass rate over time, frequency of failures by test, most commonly failing tests. This drives quality investment prioritisation.

**Incident response**: when a test fails, the response process should be pre-defined. Who is notified? What is the first triage step? Which downstream dashboards are affected? Having this documented before the first failure reduces incident resolution time significantly.

For the specific tooling for testing, see dbt best practices and data quality framework. For the observability layer that monitors beyond explicit tests, see data observability.

Our data architecture consulting practice designs and implements data warehouse testing strategies — from source validation through transformation tests and end-to-end reconciliation. If your data warehouse is producing incorrect numbers that you discover after the fact, book a free 30-minute audit to discuss your testing approach.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →