dbt Best Practices: How to Structure, Test, and Maintain a Production dbt Project

The patterns that separate maintainable dbt projects from ones that become technical debt: project structure, naming conventions, testing strategy, documentation, and performance.

The quick answer

Most dbt projects start clean and become technical debt within 12–18 months. The causes are predictable: inconsistent naming conventions, undertested models, oversized model files, undocumented sources, and a flat project structure that makes it impossible to understand what runs before what. This guide covers the practices that keep dbt projects maintainable at scale — project structure, naming conventions, testing strategy, documentation, and performance patterns.

Project structure

The most important structural decision is the layer pattern. The canonical dbt layer structure:

**Staging layer** (models/staging/): one model per source table, named stg_source__entity. Staging models do only light transformations: renaming columns to snake_case, casting types, parsing timestamps, basic null handling. No joins. No business logic. Staging models are the stable contract between raw source data and the rest of the project. When a source column renames or a type changes, you fix it in one place — the staging model — and everything downstream is protected.

**Intermediate layer** (models/intermediate/): joins, business logic, and complex transformations that span multiple sources but are not yet the final mart shape. Named int_entity__verb (e.g., int_orders__joined, int_customers__pivoted). Not exposed to BI tools. This layer prevents the common failure mode where mart models become 500-line SQL queries joining 15 tables — extract the complex joins into intermediates and keep mart models clean.

**Mart layer** (models/marts/): the final business-facing models. Wide, denormalized tables structured for BI consumption. Named by business domain and entity (fct_orders, dim_customers, dim_products). Expose these to BI tools. These are the models your analysts write dashboards against.

**Source definitions** (models/staging/sources.yml): define every raw source table in a sources.yml file. This is how dbt knows about upstream data — it validates that source tables exist, tracks freshness (source freshness tests flag when source data is late), and generates documentation and lineage for raw sources.

Keep these subdirectories under models/staging/ organised by source system: models/staging/salesforce/, models/staging/postgres/, models/staging/stripe/. This makes it easy to find all models related to a given source.

Naming conventions

**Files and models**: snake_case throughout. No mixed case, no hyphens. Model names match the file names. The prefix conventions (stg_, int_, fct_, dim_) encode the layer, which makes grep-based navigation and cross-referencing in documentation much faster.

**Columns**: snake_case. Boolean columns prefixed with is_ or has_ (is_active, has_subscription). Date columns suffixed with _date (created_date, updated_date). Timestamp columns suffixed with _at (created_at, updated_at). IDs suffixed with _id (customer_id, order_id), and be consistent about whether IDs are integers or strings — do not mix.

**Tests**: use schema.yml files co-located with your models. Define not_null, unique, accepted_values, and relationships tests on every key column of every mart. The effort is low; the benefit is catching data issues before they reach dashboards.

**Macros**: snake_case, prefixed to signal purpose. Utility macros: clean_string, generate_surrogate_key. Custom test macros: test_is_positive, test_date_in_range. Do not write macros for things that SQL handles cleanly in a model.

Testing strategy

**The baseline**: every mart model should have at minimum: unique and not_null on the primary key; not_null on all foreign keys; accepted_values on status/type columns that have a known enum. This baseline catches the most common data quality failures.

**Relationships tests**: use the relationships test to assert that every foreign key in a fact table exists in the corresponding dimension. This surfaces join issues before analysts discover them in dashboards with unexpected nulls.

**Custom data tests**: for business-logic assertions that go beyond the built-in generic tests, write custom singular tests (SQL queries that should return zero rows when data is valid). Place them in the tests/ directory. A test asserting that total_revenue is always positive, or that order dates are never in the future, is simple to write and catches real issues.

**dbt-expectations**: the dbt-expectations package (inspired by Great Expectations) adds a library of parameterised tests: expect_column_values_to_be_between, expect_column_mean_to_be_between, expect_table_row_count_to_be_between. Use these for numerical validation that built-in generic tests do not cover.

**Test severity**: configure severity: warn vs severity: error at the model or project level. Tests that should fail the pipeline: error. Tests where a warning in CI is sufficient: warn. Use this to avoid false-positive build failures from noisy tests while still surfacing issues.

**Source freshness**: configure loaded_at_field and warn_after / error_after for every source in sources.yml. dbt source freshness checks will flag when source data is late, before downstream pipelines run on stale data.

Documentation

**The minimum**: every model in the mart layer should have a description in its schema.yml. Every column in mart models that analysts query should have a description. This is non-negotiable if you have more than one person using the project.

**Column descriptions in schema.yml vs YAML anchors**: for columns that appear in many models (customer_id, order_id, created_at), use YAML anchors to define the description once and reference it across models. This keeps documentation consistent and reduces maintenance.

**dbt docs generate**: run dbt docs generate and dbt docs serve to generate and view the documentation site locally. In production, deploy the docs site to a static host (S3 + CloudFront, GitHub Pages) so analysts can browse the data catalog without running dbt locally.

**Exposures**: define exposures in exposures.yml for every downstream dashboard, report, or application that depends on your dbt models. Exposures show up in the dbt lineage graph, making it visible which dashboards will be affected by a model change. This is the closest thing dbt has to impact analysis.

Performance patterns

**Materialisation strategy**: the default materialisation is view — every model is a SQL view rebuilt on each query. Views are fine for staging and some intermediate models. Mart models that are queried frequently by BI tools should be materialised as tables (rebuilt on each dbt run) or incremental (only new rows processed on each run).

**Incremental models**: use incremental materialisation for large fact tables. The incremental logic filters the source to only new or updated records (typically using a timestamp column), inserts or merges them into the target table, and skips processing historical records on each run. This reduces run time dramatically for large datasets — a full-refresh of a 500M-row fact table takes hours; an incremental run processes only the last day's records in minutes.

**Unique key for incremental merge**: for incremental models where records can be updated (not just appended), define a unique_key. When unique_key is set, dbt uses a MERGE (or DELETE + INSERT) to handle updates rather than a simple INSERT. This prevents duplicate rows when upstream source records are updated.

**Partitioning and clustering**: in Snowflake and BigQuery, configure table clustering on columns frequently used as query filters (date columns, entity IDs). For mart models materialised as tables, clustering dramatically improves query performance for BI tool queries that filter by date range or customer.

**Avoiding select star**: in dbt models, avoid SELECT * from upstream models. Explicit column selection makes the model's output schema clear, prevents accidental column propagation, and helps downstream users understand what each model contains.

**Model size**: if a model's SQL exceeds 100–150 lines, consider whether it should be split into an intermediate model and a mart model. Large SQL is hard to review, hard to debug, and usually indicates mixed concerns in a single model.

CI/CD patterns

**Slim CI**: in CI pipelines, use dbt's state comparison to run only modified models and their downstream dependencies, rather than the full project. This dramatically reduces CI run time for large projects. The command pattern uses the state:modified selector against a production manifest artifact.

**Deferred execution**: in Snowflake and BigQuery, use dbt's defer flag in development to reference production model results for upstream models you have not modified. This avoids rebuilding the entire upstream graph for every local development run.

**PR checks**: CI should run dbt compile (syntax check), dbt test on the modified subset, and dbt source freshness. Fail the PR if tests fail. This is the minimum CI gate for a production dbt project.

Common failure patterns

**The staging-mart bypass**: models that directly join raw source tables without going through staging. When the source table renames a column, every model that bypassed staging breaks. Enforce the layer pattern.

**Business logic in staging**: staging models that contain joins, business rules, or complex calculations. Staging should be a thin transformation layer — renaming and casting only. Business logic belongs in intermediates and marts.

**No tests on key columns**: projects where tests were planned but not written, leaving data quality invisible until analysts notice anomalies in dashboards. Add at minimum unique + not_null on every primary key.

**Materialisation debt**: a project where everything is materialised as a table (slow builds) or everything is a view (slow queries). Audit the materialisation strategy and match it to usage patterns.

For the orchestration layer that runs dbt in production, see apache airflow guide. For the data ingestion that feeds dbt staging models, see fivetran vs airbyte. For the warehouse platforms dbt runs against, see snowflake architecture guide and snowflake vs redshift.

Our data architecture consulting practice designs and implements production dbt projects — from initial setup and layer structure through testing strategy and CI/CD integration. If your dbt project is becoming difficult to maintain or you are starting a new implementation, book a free 30-minute audit.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →