The analytics engineer is a relatively new data team role that sits between data engineering and data analysis. This guide explains what analytics engineers do, why the role emerged, how it differs from data engineers and analysts, and the tools that define analytics engineering in practice.
The analytics engineer is a relatively new data team role that emerged from the intersection of data engineering capabilities and data analysis domain knowledge. Formalized largely by the dbt community around 2018–2019, the role addresses a specific gap that became apparent as modern data stacks matured: someone needed to own the transformation layer between raw ingested data and the clean analytical tables that analysts and BI tools consume.
How the Role Emerged
Before the modern data stack (Fivetran + dbt + cloud warehouse), data transformation was handled either by data engineers (who wrote Python and SQL transformation jobs) or by analysts (who wrote complex SQL directly against source tables). Neither arrangement worked well.
Data engineers owned the pipelines but often lacked the domain knowledge to encode business logic correctly — they built the infrastructure but needed the business team to specify what the transformation should do, and that specification was rarely precise enough to avoid misunderstandings.
Analysts owned the business logic but often lacked the engineering discipline to implement it reliably — their SQL was often ad-hoc, undocumented, not tested, and not reproducible. Metrics were encoded in spreadsheet formulas or embedded in Tableau calculated fields, not in a shared, governed transformation layer.
The analytics engineer role was defined to own this gap: someone who understands both the technical transformation layer and the business logic it should encode. dbt provided the tool that made this role productive — a SQL-first transformation framework where transformations are SELECT statements, documentation is colocated with models, and tests are first-class citizens.
What Analytics Engineers Do
**Build and maintain dbt models** — writing, testing, and documenting the SQL transformation models that produce the cleaned, business-logic-enriched tables that analysts and BI tools query. Analytics engineers own the dbt project: model naming conventions, folder structure, test coverage, documentation standards.
**Encode business logic** — working with domain stakeholders to understand how metrics are defined, what filters apply to revenue calculations, what constitutes a "qualified lead" or an "active user," and encoding those definitions in dbt models that all consumers of those metrics share.
**Maintain data quality** — writing and maintaining dbt tests that validate assumptions about the data: primary key uniqueness, referential integrity, expected value ranges, not-null constraints. When tests fail, analytics engineers investigate and fix.
**Build and document data sources** — defining how raw source tables are staged, what cleaning is applied, and what the data represents. The staging models analytics engineers write are the documented interface between raw source data and the transformation layer.
**Support analysts and BI developers** — analytics engineers are often the internal experts that analysts turn to when they need new data, when they suspect a model is wrong, or when they need to understand how a metric is calculated. They are the bridge between the engineering team that builds infrastructure and the analytical team that uses it.
**Evaluate and adopt tooling** — analytics engineers are often the team members most engaged with the evolution of the modern data stack. They evaluate new dbt features, consider new transformation approaches, and document conventions that the team should follow.
Analytics Engineer vs Data Engineer
The distinction is primarily in orientation and scope:
A **data engineer** is infrastructure-first. They build the pipelines that move data from sources to the warehouse, manage the orchestration and reliability of those pipelines, design the warehouse schema, and ensure data arrives with high reliability. They are typically strong in Python, cloud infrastructure (AWS, GCP, Azure), and systems thinking. They may be less focused on business domain knowledge.
An **analytics engineer** is transformation-first. They work inside the warehouse, not on the infrastructure around it. They write SQL, document models, build tests, and serve as the domain-knowledge bridge between the raw data that engineers land and the analytical tables that analysts use. They need strong SQL skills and business domain understanding, but less cloud infrastructure expertise.
In smaller teams, one person often does both. As teams grow, the roles specialize.
Analytics Engineer vs Data Analyst
The distinction here is between building and using:
A **data analyst** uses the clean, governed analytical tables that the analytics engineer builds. They write queries, build dashboards, produce analyses. Their primary output is insight and decision support for business stakeholders. They may write dbt models occasionally, but their primary work is analysis.
An **analytics engineer** builds and maintains the foundational data layer that analysts depend on. They are less focused on producing business insights directly and more focused on ensuring the analytical foundation is clean, tested, documented, and trustworthy. Their primary output is the governed transformation layer.
The boundary between these roles is frequently blurry in smaller teams, where an analyst may maintain some dbt models. The distinction sharpens as teams grow.
The dbt Ecosystem and Analytics Engineering
dbt is the tool most associated with analytics engineering, to the point that "analytics engineer" and "dbt practitioner" are nearly synonymous in practice.
dbt Core is the open-source framework. Key concepts:
- **Models**: SQL SELECT statements that define a transformed table or view
- **Sources**: references to raw tables in the warehouse, with freshness tests
- **Tests**: assertions about model outputs — unique, not_null, accepted_values, relationships
- **Documentation**: YAML schema files with descriptions for every model and column
- **Snapshots**: dbt's implementation of SCD Type 2 for slowly changing dimensions
dbt Cloud adds scheduling, CI/CD integration (running tests before merging changes), a metadata API, and the Explorer interface for lineage and column-level documentation.
Analytics engineers who are expert in dbt have become valuable across the industry — the skill is portable across organizations running any cloud warehouse.
Our data architecture practice includes analytics engineering work — designing and building dbt transformation layers, establishing testing and documentation standards, and building the governed analytical foundation that BI tools and analysts depend on. Contact us to discuss your transformation layer requirements.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →