How to Structure a Data Team: Models, Roles, and the Organisational Trade-Offs

How you organise your data team determines what gets built, how fast, and how well it serves the business. This is a practical guide to the main data team structure models and when each is appropriate.

The quick answer

There is no single correct way to structure a data team. There are three dominant models (centralised, federated, embedded), and each suits different organisation sizes and stages. The centralised model works for most companies under 500 employees, and for companies of any size where data quality and consistency are the primary concerns. The federated model scales better in large organisations with multiple business units but requires significant governance investment. The embedded model provides the fastest domain responsiveness but gives up the economies of scale and quality consistency of centralisation. Most organisations sit somewhere between models, and the structure should evolve as the organisation grows.

The centralised model

In the centralised model, all data engineers, analytics engineers, and data analysts report to a single data team (often under a VP of Data, Chief Data Officer, or CTO). The team owns the data platform, builds and maintains pipelines, designs the data model, and produces analytics for the business.

**How it works**: business stakeholders submit requests to the data team. The team prioritises, estimates, and delivers. The data team owns the infrastructure and the data quality; the business teams consume outputs.

**Advantages**: strong data quality and consistency (one team, one set of standards); economies of scale (one platform, one set of tools, shared expertise); clear ownership (every data problem has a clear owner); efficient use of senior expertise (one senior architect serves the whole company).

**Disadvantages**: bottleneck at scale (as the company grows, every new business unit's data requests go through the same central team); domain expertise gap (the data team may not deeply understand every business domain it supports; a central analyst covering both Finance and Product is unlikely to be expert in either); context-switching cost (engineers who switch between domains are less productive than those who specialise).

**Best for**: companies under 500 employees; organisations where data quality and governance are higher priorities than speed; organisations with a small number of well-defined analytical use cases.

The federated (hub-and-spoke) model

In the federated model, a central data platform team (the hub) owns the infrastructure, tooling, and governance standards. Domain-aligned data analysts or analytics engineers (the spokes) are embedded in or closely aligned with business units (Finance, Product, Marketing, Customer Success) and report to those units or to a matrix structure.

**How it works**: the platform team builds and operates the data warehouse, dbt project, Airflow environment, and governance framework. Domain data people build domain-specific models, dashboards, and analyses using the platform the central team provides. The central team sets standards (dbt model naming, testing requirements, certified data source requirements); domain teams work within those standards.
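As an illustration of how a central team might enforce one such standard, here is a minimal sketch of a model-naming check that could run in CI. The prefix convention (`stg_`, `int_`, `fct_`, `dim_`) and the function name `find_naming_violations` are assumptions made for this example, not a standard the platform team must adopt.

```python
import re

# Assumed convention for this sketch: a layer prefix (stg, int, fct, dim)
# followed by a snake_case name. Substitute your own team's standard.
NAME_PATTERN = re.compile(r"(stg|int|fct|dim)_[a-z][a-z0-9_]*")


def find_naming_violations(model_names):
    """Return the model names that do not follow the assumed convention."""
    return [name for name in model_names if not NAME_PATTERN.fullmatch(name)]


# Hypothetical model names submitted by a domain team.
submitted = ["stg_orders", "fct_revenue", "FinanceRevenueFinal", "dim_customer"]
violations = find_naming_violations(submitted)
print(violations)
```

A check like this keeps the standard cheap to follow: domain teams get feedback when they open a change, rather than in a later review by the platform team.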

**Advantages**: domain alignment (the Product data person understands the product deeply and can answer analytical questions that would take a central team weeks to scope); scale (adding a new domain does not linearly increase the central team's workload, because domain teams handle domain-specific work); platform leverage (the central team's platform investment serves all domains without duplication).

**Disadvantages**: governance complexity (without strong governance, domain teams working independently of the platform team risk producing inconsistent metrics, shadow data marts, and undocumented pipelines); cross-domain work falls into the gap (analytics that spans Finance and Product has no natural owner); quality variance (domain teams without strong data engineering backgrounds may produce lower-quality models than a centralised team).

**Best for**: companies over 500 employees with multiple distinct business domains; organisations with strong analytics engineers in domain teams; organisations investing in a data governance framework to prevent fragmentation.

The embedded model

In the embedded model, data people (analysts, analytics engineers, or data engineers) report directly into business units — not to a central data function. There is no central data team; each business unit owns its own data infrastructure and analytics.

**How it works**: the Finance team has its own data engineer and analyst. The Product team has its own. Engineering has its own. Each team builds what it needs.

**Advantages**: maximum domain alignment and responsiveness; no central bottleneck; domain teams move at their own pace.

**Disadvantages**: no economies of scale (each team builds its own pipelines, models, and tools); significant data consistency risk (Finance and Product define "revenue" differently; nobody reconciles them); no platform investment (each team builds ad hoc solutions rather than investing in shared infrastructure); loss of senior expertise leverage (five domain teams each need a senior data engineer rather than sharing one).
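The consistency risk is easiest to see with a toy example. The sketch below assumes two hypothetical definitions of "revenue" (Finance nets out refunded orders, Product counts gross bookings) and made-up order data; neither definition comes from a real organisation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    amount: float
    refunded: bool


def finance_revenue(orders):
    """Hypothetical Finance definition: refunded orders are excluded."""
    return sum(o.amount for o in orders if not o.refunded)


def product_revenue(orders):
    """Hypothetical Product definition: gross bookings, refunds included."""
    return sum(o.amount for o in orders)


# Illustrative data only.
orders = [Order(100.0, False), Order(250.0, True), Order(50.0, False)]

# Both teams report "revenue" from the same orders, yet the numbers differ.
gap = product_revenue(orders) - finance_revenue(orders)
print(finance_revenue(orders), product_revenue(orders), gap)
```

Both functions are defensible on their own. The problem in the embedded model is that no team owns the job of noticing the gap and agreeing on a single certified definition.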

**Best for**: very large organisations where domain independence is a primary value; or organisations in early stages before a central data function makes sense (individual contributors embedded in teams before a data team exists).

The common evolution path

Most companies follow a predictable evolution:

**Stage 1 (0–50 employees)**: individual data people embedded in teams, no central function. Each team fends for itself.

**Stage 2 (50–200 employees)**: first data hire or small central team (2–5 people). Centralised, building the foundation — one warehouse, one BI tool, shared metrics.

**Stage 3 (200–500 employees)**: centralised team grows (5–15 people), starts to feel bottleneck pressure. Domain-aligned analytics roles appear (Product analyst, Marketing analyst) but still report to central team or work very closely with it.

**Stage 4 (500–2,000 employees)**: federated model becomes necessary. Platform team separates from domain analytics teams. Governance investment required to maintain consistency.

**Stage 5 (2,000+ employees)**: large organisations often have multiple federated domain data teams, a central platform team, and a data governance function. Some approach data mesh — see data mesh architecture.
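The stages above can be sketched as a simple headcount lookup. The ranges in the text share their boundary values (50, 200, 500, 2,000), so this sketch assumes each boundary belongs to the later stage; the function name `typical_stage` and the short descriptions are illustrative, and the output is a rule of thumb rather than a prescription.

```python
from bisect import bisect_right

# Exclusive upper bounds of stages 1 to 4, taken from the ranges above.
# Assumption: a boundary value (e.g. exactly 50) falls into the later stage.
STAGE_UPPER_BOUNDS = [50, 200, 500, 2000]

STAGE_STRUCTURES = [
    "embedded individuals, no central function",
    "first data hire or small centralised team",
    "growing centralised team with domain-aligned analysts",
    "federated: platform team plus domain analytics teams",
    "multiple federated domain teams, platform team, governance function",
]


def typical_stage(employees):
    """Return (stage number, typical structure) for a given headcount."""
    if employees < 0:
        raise ValueError("employees must be non-negative")
    index = bisect_right(STAGE_UPPER_BOUNDS, employees)
    return index + 1, STAGE_STRUCTURES[index]


stage, structure = typical_stage(320)
print(stage, structure)
```

Headcount is only a proxy here. A 150-person company with several distinct business lines may feel Stage 3 bottleneck pressure earlier than the lookup suggests.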

Key roles and career paths

**Data Analyst**: business-facing analytics, SQL, BI tools. Entry point to the data function. Career path: Data Analyst → Senior Analyst → Analytics Manager → Director of Analytics.

**Analytics Engineer**: the bridge between data engineering and analysis. Builds and maintains dbt models, the transformation layer, and certified data sources. Increasingly the core role in modern data teams. Career path: Analytics Engineer → Senior Analytics Engineer → Staff Analytics Engineer.

**Data Engineer**: builds and maintains the data platform — ingestion, orchestration, platform infrastructure. More engineering-heavy than analytics engineering. Career path: Data Engineer → Senior Data Engineer → Staff/Principal Data Engineer → Data Architect.

**Data Architect**: designs the overall platform architecture, makes vendor and tool decisions, defines governance standards, mentors the team. Typically 8–12 years of combined experience. See how to become a data architect.

**Head of Data / VP of Data**: leads the data function. Combination of technical depth (can evaluate architectural decisions) and organisational influence (can advocate for data investment at the leadership level).

Hiring sequence for a new data team

The most common mistake: hiring data analysts before having the infrastructure to support them. Data analysts without reliable, trusted data spend their time cleaning data and answering "where is this from" questions rather than doing analysis.

Recommended hiring sequence: (1) a Data Engineer or Analytics Engineer to build the foundation (warehouse, ingestion, basic transformation layer); (2) a Senior Data Analyst once the foundation delivers reliable data; (3) a second Data Engineer or Analytics Engineer as volume grows; (4) a Head of Data once the team reaches 4–5 people and needs management and strategy ownership.

For the governance framework that supports the team at scale, see data governance framework. For the platform architecture the team builds, see enterprise data platform architecture. For the career paths within the data team, see how to become a data architect.

Our data architecture consulting and BI strategy consulting practices advise organisations on data team structure and hiring — from first data hire decisions through federated model governance design. Book a free 30-minute audit to discuss your data organisation.
