dbt Testing: A Complete Guide to Data Quality in Your Transformation Layer

How to build a comprehensive data quality testing strategy with dbt — built-in generic tests, singular tests, dbt-expectations for advanced assertions, test coverage strategy, and how to structure testing so it catches real data quality failures without creating maintenance overhead.

dbt testing is the primary mechanism for enforcing data quality in the transformation layer. A well-designed test suite catches data quality failures before they reach BI tools and business users. A poorly designed one creates maintenance overhead without catching real problems. This guide covers the full testing toolkit, how to apply it strategically, and the patterns that distinguish production-quality dbt projects from ones that give testing a bad name.

The Two Types of dbt Tests

dbt has two categories of test:

**Generic tests** (also called schema tests) are reusable test definitions applied to columns or models in schema.yml. You declare the assertion once and dbt generates the SQL. Built-in generic tests are not_null, unique, accepted_values, and relationships.

**Singular tests** are individual SQL files in the tests/ directory. Each file contains a query that returns rows when a test fails (zero rows = pass). These are for complex assertions that the generic framework cannot express.

Both types run with dbt test or dbt build. Both generate test results that can be stored in the warehouse for quality dashboards. Both can be configured with severity (warn vs error) and thresholds.

Built-In Generic Tests

**not_null** is the most frequently used test. Apply it to any column that must be populated — primary keys, foreign keys, required business attributes:

columns:

- name: order_id

tests:

- not_null

**unique** validates that a column contains no duplicate values. Apply to primary keys and any column asserted to have unique cardinality:

columns:

- name: order_id

tests:

- unique

Together, not_null and unique on a primary key column are the minimum test requirement for every model. They validate the grain — that the model actually has the rows and uniqueness you designed for.

**accepted_values** validates that a column only contains values from a defined set:

columns:

- name: status

tests:

- accepted_values:

values: ['pending', 'processing', 'shipped', 'delivered', 'cancelled']

Use accepted_values for status columns, type columns, category columns. It catches upstream system changes (a new enum value appearing in the source) before they silently produce unmapped records in downstream models.

**relationships** validates referential integrity — that every value in a column exists as a value in another model's column:

columns:

- name: customer_id

tests:

- relationships:

to: ref('dim_customers')

field: customer_id

Apply relationships tests to every foreign key column in fact tables. An order with a customer_id that does not exist in the customer dimension will produce NULL joins in BI tools — a silent data quality failure.

dbt-expectations

The dbt-expectations package (maintained by the dbt community) extends the generic test framework with a much richer assertion library modelled after Great Expectations. Install via packages.yml:

packages:

- package: calogica/dbt_expectations

version: [">=0.10.0", "<0.11.0"]

Key tests from dbt-expectations:

**expect_column_values_to_be_between** validates numeric bounds — useful for catching obvious data errors like negative prices, ages over 150, or percentages outside 0-100:

- dbt_expectations.expect_column_values_to_be_between:

min_value: 0

max_value: 100000

**expect_column_proportion_of_unique_values_to_be_between** validates the uniqueness ratio — useful for columns that should have high cardinality but not necessarily strict uniqueness:

- dbt_expectations.expect_column_proportion_of_unique_values_to_be_between:

min_value: 0.95

**expect_table_row_count_to_be_between** validates that a model produces a reasonable number of rows. Catches empty loads, partial failures, and incorrect filter logic that drops most records:

- dbt_expectations.expect_table_row_count_to_be_between:

min_value: 10000

max_value: 50000000

**expect_column_values_to_match_regex** validates format patterns — email addresses, phone numbers, date strings, product SKU formats:

- dbt_expectations.expect_column_values_to_match_regex:

regex: "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"

**expect_column_pair_values_a_to_be_greater_than_b** validates cross-column relationships — end dates after start dates, delivery dates after order dates, revenue greater than cost:

- dbt_expectations.expect_column_pair_values_a_to_be_greater_than_b:

column_A: delivered_at

column_B: ordered_at

Singular Tests for Complex Assertions

When generic tests cannot express the assertion, write a singular test. A singular test is a SQL query that returns rows on failure. If the query returns zero rows, the test passes.

Common patterns:

**Orphaned records** — rows in a staging table that join to nothing in a reference table (more nuanced than a relationships test, which fails on any mismatch):

select o.order_id

from {{ ref('stg_orders') }} o

left join {{ ref('dim_customers') }} c using (customer_id)

where c.customer_id is null

and o.created_at >= '2024-01-01' -- scope to recent records

**Duplicate grain assertion** for multi-column grain keys:

select order_id, product_id, count(*) as cnt

from {{ ref('fct_order_lines') }}

group by 1, 2

having cnt > 1

**Referential count mismatch** — checking that aggregated subtotals equal a known total:

select

sum(line_item_amount) as line_total,

header_amount

from {{ ref('fct_order_lines') }}

join {{ ref('fct_orders') }} using (order_id)

group by order_id, header_amount

having abs(line_total - header_amount) > 0.01

Test Coverage Strategy

A common mistake is applying every test to every column. This creates two problems: slow test runs that developers skip, and alert fatigue from low-signal test failures.

A more effective approach is tiered test coverage:

**Tier 1 — Critical models (always test):** Fact tables, certified dimension tables, models consumed directly by BI tools. Apply: not_null and unique on grain keys, relationships on all foreign keys, row count bounds, critical field not_null. These tests should block deployment if they fail.

**Tier 2 — Important models (test key assertions):** Core staging models, important intermediate models. Apply: not_null on primary keys, unique on primary keys, accepted_values on status/type columns. Failure triggers a warning, not a block.

**Tier 3 — Supporting models (minimal testing):** Utility models, intermediate models that feed Tier 1/2 models (which have their own tests). Apply: not_null on primary key only, or no tests if the downstream model provides sufficient coverage.

Test Severity and Thresholds

dbt tests can be configured with severity:

- not_null:

config:

severity: warn

warn logs a failure without stopping the build. error stops the build. Use warn for tests that indicate data quality issues worth investigating but not deployment blockers — for example, a relationships test on a dimension that is occasionally missing new records before a daily sync.

dbt-expectations tests support failure thresholds:

- dbt_expectations.expect_column_values_to_not_be_null:

config:

warn_if: ">5"

error_if: ">100"

This allows graduated response: a small number of nulls triggers a warning, a large number triggers an error. Useful for columns where a small proportion of nulls is acceptable (optional fields in source systems) but a high proportion indicates a pipeline failure.

Storing Test Results

By default, dbt test results are logged but not persisted. To store test results in the warehouse for data quality dashboards, use dbt's test result storage:

In dbt_project.yml:

flags:

store_failures: true

Or per-test:

- not_null:

config:

store_failures: true

Failed test rows are stored in a schema (default: dbt_test__audit) as tables named after the test. You can query these tables to build data quality dashboards showing failure rates over time, failure counts by model, and trending data quality metrics.

Common Testing Mistakes

**Testing source data with dbt tests.** dbt source tests (defined in sources.yml) test your source tables. Source tables are outside your control. Test them with warn severity to surface source quality issues without blocking your own models from building.

**Not testing incremental models for grain integrity.** Incremental models that merge on a unique key can produce duplicates if the merge key is not truly unique in the incoming data. Always apply a unique test to the merge key of incremental models, and test the full table rather than just the new increment.

**Writing tests that always pass.** A test that cannot conceivably fail against your current data is not testing anything. Review your accepted_values lists and row count bounds regularly to ensure they reflect real constraints, not just the current state of the data.

**Missing relationship tests on optional foreign keys.** A nullable foreign key column can still have referential integrity violations — non-null values that do not exist in the referenced table. Apply the relationships test with where: "column_name is not null" to test only non-null values.

A well-tested dbt project should catch grain violations, null key failures, referential integrity breaks, and obvious out-of-bounds data values before they reach BI tools. The goal is not 100% column coverage — it is high-confidence signal that your core analytical models are trustworthy.

Our data engineering consulting practice helps teams design testing strategies that catch real failures without slowing development — contact us to discuss your dbt environment.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →