
# dbt CI/CD: Building a Production Deployment Pipeline

Obed Tsimi
Founder & Senior Tableau Architect
February 21, 2027 · 12 min read

How to build a CI/CD pipeline for dbt that automatically tests changes before they reach production, enforces code review, and deploys reliably — covering GitHub Actions configuration, the slim CI pattern with state-modified selection, environment management, and the operational practices that make dbt deployments reliable at scale.

A dbt project without CI/CD relies on individuals to manually test their changes, manually coordinate deployments, and manually notify downstream consumers when something breaks. At small scale, this works. As the project grows, as more engineers contribute, and as more business decisions depend on the models, manual processes become the primary source of production data quality failures. This guide covers how to build a CI/CD pipeline for dbt that catches problems before they reach production.

## The Target State

A production-grade dbt CI/CD pipeline looks like this:

1. Engineer makes changes on a feature branch

2. Engineer opens a pull request

3. CI pipeline automatically runs: compiles the project, runs dbt tests on a development environment, checks for breaking schema changes, runs lint checks

4. Reviewer reviews the code and CI results

5. On approval and merge to main, CD pipeline automatically deploys: runs dbt in production, runs full test suite, alerts on failures

The goal: no change reaches production without automated testing, and no human manually runs dbt in production.

## The CI Pipeline

### What CI Checks

**Compile check:** Verifies that the modified models compile without syntax errors and that all references (ref(), source()) resolve to existing models and sources. This is the cheapest check and catches a large proportion of mistakes.

**Slim CI run:** Runs dbt test on only the modified models and their downstream dependencies using the state:modified+ selector. This tests the changes without running the full DAG, keeping CI fast.

The slim CI pattern requires access to the production manifest.json — the compiled artifact from the last successful production run. dbt compares the current project against the production manifest to identify which models changed. Most CI setups store the production manifest as a CI artefact or in cloud storage.

**Schema change detection:** Identifies breaking changes — column removals, type changes, grain changes — that will break downstream consumers. dbt's contract enforcement feature (dbt 1.5+) provides some protection; augmenting with a schema comparison tool (datafold, or a custom comparison using INFORMATION_SCHEMA) catches changes before they deploy.

**Lint checks:** Style and formatting consistency. SQLFluff with a dbt-compatible configuration enforces consistent SQL style across all contributors.
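As a rough illustration of the contract enforcement mentioned under schema change detection, a model's YAML properties might look like the sketch below. The file path, the model name fct_orders, and its columns are hypothetical; the contract config and the data_type field are dbt's own (1.5+), and the type names assume Snowflake.

```yaml
# models/marts/schema.yml (hypothetical path and model)
version: 2

models:
  - name: fct_orders
    config:
      contract:
        enforced: true   # the build fails if the model's columns or types differ from this spec
    columns:
      - name: order_id
        data_type: varchar
        constraints:
          - type: not_null
      - name: ordered_at
        data_type: timestamp_ntz
      - name: order_total
        data_type: number
```

With the contract enforced, dropping or retyping one of these columns in the model SQL fails at build time in CI instead of surfacing later in a downstream dashboard.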

### GitHub Actions Configuration

A minimal GitHub Actions workflow for dbt CI:

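    name: dbt CI

    on:
      pull_request:
        branches: [main]

    jobs:
      dbt-ci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3

          - name: Set up Python
            uses: actions/setup-python@v4
            with:
              python-version: '3.11'

          - name: Install dbt
            run: pip install dbt-snowflake==1.7.0

          - name: dbt deps
            run: dbt deps

          # Keep the production manifest outside target/: dbt compile overwrites
          # target/manifest.json, which would make the state comparison a no-op.
          - name: Download production manifest
            run: aws s3 cp s3://my-dbt-artifacts/manifest.json prod-artifacts/manifest.json

          - name: dbt compile
            run: dbt compile

          # dbt build runs and tests the modified models, so the tests exercise
          # the PR's code and not the deferred production relations.
          - name: dbt build (slim CI)
            run: dbt build --select state:modified+ --defer --state prod-artifacts/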

The --defer flag tells dbt to use the production environment for unmodified upstream models — the CI environment only needs to materialise the changed models, referencing production for everything else.

### CI Environment Setup

The CI environment should be isolated from production. Options:

**Schema-based isolation:** Create a dedicated CI schema for each pull request in the production warehouse (for example ci_pr_123, ci_pr_124). Because each PR writes to its own schema, CI runs from different PRs cannot interfere with each other. Drop the schema once the PR is merged or closed.

**Separate warehouse:** For complete isolation, use a separate development warehouse that runs CI. This adds cost (compute) but ensures CI cannot affect production performance.

**CI credentials:** Use a service account with write access to the CI schema and read access to production sources and models (the --defer pattern reads unmodified upstream models from production). Do not use personal credentials in CI.
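A minimal sketch of a profiles.yml target that implements per-PR schema isolation with a service account. The profile name, role, database, warehouse, and environment variable names are all assumptions made for illustration; env_var() is dbt's built-in function for reading environment variables.

```yaml
# profiles.yml (sketch; profile, role, database and warehouse names are hypothetical)
dbt_project:
  target: ci
  outputs:
    ci:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_CI_USER') }}"          # service account, not a person
      password: "{{ env_var('SNOWFLAKE_CI_PASSWORD') }}"
      role: CI_ROLE
      database: ANALYTICS
      warehouse: CI_WH
      # One schema per pull request, e.g. ci_pr_123
      schema: "ci_pr_{{ env_var('PR_NUMBER') }}"
      threads: 4
```

In a GitHub Actions workflow, PR_NUMBER could be populated from the github.event.pull_request.number context value, so every pull request resolves to its own schema without manual configuration.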

## The CD Pipeline

### Production Deployment

The production deployment pipeline runs on merge to main:

    name: dbt Production Deploy

    on:
      push:
        branches: [main]

    jobs:
      dbt-prod:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3

          - name: Install dbt
            run: pip install dbt-snowflake==1.7.0

          - name: dbt deps
            run: dbt deps

          - name: dbt source freshness
            run: dbt source freshness --target prod

          - name: dbt run
            run: dbt run --target prod

          - name: dbt test
            run: dbt test --target prod

          # Steps run only if every earlier step succeeded, so the stored
          # manifest always reflects a good production state.
          - name: Upload production manifest
            run: aws s3 cp target/manifest.json s3://my-dbt-artifacts/manifest.json

The manifest upload at the end of a successful run provides the updated baseline for the next CI comparison.

### Failure Handling and Alerting

Production dbt run failures should alert immediately. Options:

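**GitHub Actions notifications:** Add a workflow step that runs only on failure and posts to Slack or sends email. GitHub Actions has no built-in Slack step, but Slack maintains an official action (slackapi/slack-github-action), and a step can also call a Slack incoming webhook directly.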

**dbt Cloud:** If using dbt Cloud, built-in alerting sends notifications to Slack or email on run failures. Configure alerts for all job failures, not just test failures.

**PagerDuty integration:** For critical production pipelines with SLAs, integrate failure alerting with PagerDuty to ensure on-call engineers are notified.
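As one possible shape for the GitHub Actions option, the step below posts to a Slack incoming webhook when any earlier step in the job has failed. It is a sketch: SLACK_WEBHOOK_URL is an assumed repository secret, and the message text is illustrative.

```yaml
# Appended to the end of the production job's steps (sketch).
# SLACK_WEBHOOK_URL is an assumed repository secret holding a Slack incoming-webhook URL.
- name: Notify Slack on failure
  if: failure()   # runs only when an earlier step in the job failed
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
  run: |
    curl -sS -X POST -H 'Content-type: application/json' \
      --data "{\"text\": \"dbt production deploy failed: ${RUN_URL}\"}" \
      "$SLACK_WEBHOOK_URL"
```

Including the run URL in the message lets whoever picks up the alert go straight to the failing logs.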

## Environment Strategy

### Development → Staging → Production

A robust CD pipeline has three environments:

**Development:** Individual engineer environments. Engineers develop and test locally or in a personal development schema. The active target in profiles.yml determines which schema dbt writes to. Engineers run dbt compile and dbt run locally against a development schema before opening a PR.

**Staging/QA:** A shared environment that mirrors production structure, populated with production-representative data. Staging runs the full dbt DAG (not just modified models) to validate that the change works end-to-end. Staging is optional for simpler pipelines but valuable for complex changes that affect core models.

**Production:** The live environment. Only changes that have passed CI and been code-reviewed reach production.

### Blue-Green Deployment for Breaking Changes

For schema changes that break backwards compatibility — removing a column, changing a primary key definition — a blue-green deployment minimises disruption:

1. Deploy the new schema alongside the old schema (new model name or new schema)

2. Migrate downstream dependencies to the new schema

3. Update any BI tool data sources to point to the new schema

4. Retire the old schema after all consumers have migrated

Without blue-green, breaking schema changes cause downstream failures the moment they deploy. Blue-green allows zero-downtime migration but requires more coordination.
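One way to approach step 1 on dbt 1.5 or later is dbt's model versions feature. This is a substitute technique, not something the steps above prescribe, and the model and column names here are hypothetical. The sketch keeps the old shape available as version 1 while publishing the breaking change as version 2.

```yaml
# models/marts/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: dim_customers
    latest_version: 1          # an unpinned ref('dim_customers') still resolves to v1
    columns:
      - name: customer_id
        data_type: varchar
      - name: legacy_segment
        data_type: varchar
    versions:
      - v: 1                   # existing shape, kept alive during the migration
      - v: 2                   # new shape: drops legacy_segment
        columns:
          - include: all
            exclude: [legacy_segment]
```

By default dbt looks for the SQL in dim_customers_v1.sql and dim_customers_v2.sql. Downstream models opt in with ref('dim_customers', v=2); once every consumer has migrated, latest_version can be raised to 2 and version 1 retired.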

## dbt Cloud vs Custom CI/CD

**dbt Cloud:** Provides built-in CI/CD through the "Slim CI" job type — automatically runs on PR creation, uses dbt's state comparison for modified-only testing, and has native GitHub/GitLab integration. The simplest setup for teams using dbt Cloud.

**Custom GitHub Actions / GitLab CI:** More flexible, lower cost for high-frequency builds, can integrate with any cloud provider and any warehouse. Requires more setup time but is not significantly more complex once the template is established.

**The decision:** If you are already on dbt Cloud and are within the seat count where dbt Cloud pricing makes sense, use dbt Cloud's native CI/CD — it is well-designed and reduces operational overhead. If you are on dbt Core or if dbt Cloud costs are a concern, GitHub Actions with the slim CI pattern is the right approach.

## Secrets and Credential Management

CI/CD pipelines need warehouse credentials to run. Best practices:

**GitHub Actions secrets:** Store warehouse credentials as encrypted repository secrets, not in the repository or workflow YAML files. Access secrets in workflows as environment variables.

**Service accounts, not personal accounts:** Use a dedicated service account for CI and production deployments. Personal accounts create dependency on individual people; service accounts are managed as infrastructure.

**Least privilege:** The CI service account needs read access to source schemas and production model schemas (for --defer) and write access to CI schemas only. The production service account needs write access to production schemas. Neither needs admin access.
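A minimal sketch of how repository secrets reach dbt as environment variables. The secret and variable names are hypothetical, and the install steps are left out for brevity.

```yaml
# Job-level env block in the CI workflow (sketch; secret names are hypothetical).
jobs:
  dbt-ci:
    runs-on: ubuntu-latest
    env:
      # Values are stored as encrypted repository secrets and masked in logs
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
      SNOWFLAKE_CI_USER: ${{ secrets.SNOWFLAKE_CI_USER }}
      SNOWFLAKE_CI_PASSWORD: ${{ secrets.SNOWFLAKE_CI_PASSWORD }}
    steps:
      - uses: actions/checkout@v3
      # Connection check using the injected credentials
      - run: dbt debug --target ci
```

Because the credentials exist only as environment variables at run time, profiles.yml can read them with env_var() and be committed to the repository without exposing anything sensitive.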

Our data engineering consulting practice implements production dbt deployments and CI/CD pipelines — contact us to discuss dbt pipeline architecture for your environment.
