A data product is a data asset managed and delivered with the same rigour as a software product — with a defined owner, quality SLAs, versioning, and documentation. It is the data mesh's answer to the data pipeline problem. Here is what a data product actually is and how to build one.
The quick answer
A data product is a data asset — a table, a dataset, an API, a report, a model — that is designed, built, and maintained with the rigour of a software product. It has a defined owner, quality guarantees, a versioning strategy, documentation that a consumer can use, and a support process when it breaks. The contrast is with a data pipeline output: a table that exists because a pipeline wrote to it, with no owner, no quality guarantee, no documentation, and no support.
The concept of the data product comes from the data mesh architecture (Zhamak Dehghani, 2019), where domain teams are responsible for making their data available to other teams as data products rather than as raw pipeline outputs. But data products are valuable regardless of whether you are implementing data mesh — they are a quality standard for data assets that makes them actually useful to the people who consume them.
What makes a data asset a data product
A data asset is a data product when it has:
**A named owner**: a specific person (not "the data team") who is accountable for the product's quality, availability, and evolution. The owner makes decisions about the product schema, accepts or rejects requests for changes, and is responsible when the product breaks.
**A defined consumer contract**: documentation that tells consumers what the product contains, what the schema is, what each field means, what the quality guarantees are (null rates, uniqueness, freshness), and what the SLAs are for updates and resolution of quality issues.
**Quality guarantees**: automated quality checks that verify the product meets its quality standards on every update. Not aspirational quality statements — technical controls that fail the pipeline when quality violations occur.
**Versioning**: a strategy for how the product evolves. Schema changes that would break consumers are managed through versioning: the old version continues to be available while the new version is adopted. Deprecation is communicated in advance with a migration path.
**Discoverability**: documentation that enables potential consumers to find the product and understand whether it serves their use case. A data catalogue entry, a README in the data platform, or a published schema with descriptions — enough that a consumer can evaluate the product without asking the owner.
**SLAs**: defined availability, freshness, and support SLAs. "This table is updated daily by 6am; critical quality failures are resolved within 4 hours; schema changes are communicated 2 weeks in advance." SLAs create accountability; without them, a data product is just a named pipeline output.
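As a rough illustration of what "technical controls that fail the pipeline" can look like, here is a minimal plain-Python sketch of a quality gate. The names `QualityContract`, `check_quality`, and `enforce` are hypothetical, rows are assumed to be dictionaries, and the freshness rule is a simplification: once today's deadline has passed, the product must have been updated today.

```python
from dataclasses import dataclass
from datetime import datetime, time


class QualityViolation(Exception):
    """Raised to stop the pipeline when a product misses its guarantees."""


@dataclass
class QualityContract:
    # Hypothetical contract fields, named for this sketch only.
    not_null: list[str]   # columns that may never be null
    unique: list[str]     # columns whose values must not repeat
    fresh_by: time        # daily freshness deadline, e.g. 06:00


def check_quality(rows, contract, last_updated, now):
    """Return a list of violation messages; an empty list means a pass."""
    violations = []
    for col in contract.not_null:
        nulls = sum(1 for r in rows if r.get(col) is None)
        if nulls:
            violations.append(f"{col}: {nulls} null value(s)")
    for col in contract.unique:
        values = [r.get(col) for r in rows if r.get(col) is not None]
        dupes = len(values) - len(set(values))
        if dupes:
            violations.append(f"{col}: {dupes} duplicate value(s)")
    # Assumed freshness rule: after today's deadline, the last update
    # must carry today's date.
    deadline = datetime.combine(now.date(), contract.fresh_by)
    if now >= deadline and last_updated.date() < now.date():
        violations.append(f"stale: last updated {last_updated:%Y-%m-%d %H:%M}")
    return violations


def enforce(rows, contract, last_updated, now):
    """Fail loudly instead of publishing a product that breaks its contract."""
    violations = check_quality(rows, contract, last_updated, now)
    if violations:
        raise QualityViolation("; ".join(violations))
```

Calling `enforce` as the last step of an update means a violation stops publication rather than reaching consumers. A real implementation would more likely run these checks in the data platform itself than over in-memory rows.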
What data products are not
**A data product is not the same as a data asset**: all data products are data assets, but most data assets are not data products. The difference is intentional design, ownership, and the consumer contract.
**A report or dashboard is not automatically a data product**: most reports and dashboards are delivery mechanisms. The data product is the governed, documented, quality-assured data layer that they consume. A report built directly on a raw source table, with no documented and governed intermediate layer, is not consuming a data product. A dashboard becomes a data product in its own right only when it is maintained to the same standard, with an owner, certification, versioning, and SLAs.
**A data product is not a BI tool feature**: some BI tool vendors use "data product" loosely to describe features like certified data sources or reusable datasets. These are components of a data product approach, not data products themselves.
**A data product does not require data mesh**: data mesh is an organisational model where domain teams own and publish data products. The product quality standard — ownership, contract, quality, versioning — is valuable independently of whether your organisation implements data mesh.
Types of data products
**Dataset/table data products**: a governed, documented, quality-assured table or set of tables in the data platform. The Customer Gold table, the MRR table, the Orders fact table — when these are designed with ownership, quality guarantees, schema documentation, and SLAs, they are data products.
**API data products**: data served via an HTTP API rather than a direct database query. Useful when consumers are application developers who need programmatic access, when data access should be abstracted from the underlying storage technology, or when usage-based metering is required.
**Embedded analytics data products**: a Tableau workbook or Power BI dataset published as a certified, governed product with defined SLAs and a change management process. Embedded analytics in applications — where the analytics are part of the product experience rather than internal BI — are naturally data products.
**ML feature data products**: feature store features that are published for use by multiple ML models. The offline store table and online store serving layer, combined with documentation of what the feature represents and its quality guarantees, constitute a data product for ML consumers.
**Report/dashboard data products**: dashboards designed and maintained to a product standard — certified, versioned, with defined update schedules and support SLAs. The standard executive dashboard or the certified operational dashboard that a business function depends on daily.
Building data products: the minimum viable approach
You do not need a formal data mesh implementation or a new governance platform to start building data products. The minimum viable approach has five steps:
**Step 1: Identify your highest-value data assets.** Which tables or datasets are most frequently queried by the most people? Which are critical path for key business processes? These are the candidates for data product treatment.
**Step 2: Assign owners.** For each high-value asset, identify the business function that is most knowledgeable about and most dependent on it. Assign ownership to a named person in that function. Define what ownership means: reviewing schema change requests, approving access, and being notified when quality issues occur.
**Step 3: Document the consumer contract.** Write a data dictionary entry for each data product: what does this asset contain, what does each field mean, what are the quality standards, what are the update SLAs. Publish this documentation where consumers can find it — in the data catalogue, in the data platform's README, or in a shared Confluence page.
**Step 4: Implement automated quality checks.** For each data product, implement dbt tests (not_null, unique, accepted_values, and relationships for referential integrity) that run on every update and fail loudly if quality standards are violated. The data product owner is notified on failure and is responsible for resolution.
**Step 5: Establish a change management process.** Define how schema changes are communicated to consumers. At minimum: a Slack channel or email distribution list that notifies consumers of upcoming changes, a minimum notice period (2 weeks for breaking changes), and a versioning strategy for changes that cannot be backwards-compatible.
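The change management rules in Step 5 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: schemas are modelled as plain `{column: type}` dictionaries, removed columns and type changes are treated as breaking, added columns are treated as compatible (which can still break consumers who rely on `SELECT *`), and the `major.minor` version scheme and all function names are hypothetical.

```python
from datetime import date, timedelta

# The minimum notice period for breaking changes.
NOTICE_PERIOD = timedelta(weeks=2)


def classify_changes(old_schema, new_schema):
    """Compare two {column: type} mappings and split changes by impact."""
    breaking, compatible = [], []
    for col, old_type in old_schema.items():
        if col not in new_schema:
            breaking.append(f"removed column {col}")
        elif new_schema[col] != old_type:
            breaking.append(
                f"changed type of {col}: {old_type} -> {new_schema[col]}"
            )
    for col in new_schema:
        if col not in old_schema:
            compatible.append(f"added column {col}")
    return breaking, compatible


def next_version(current, breaking, compatible):
    """Bump a 'major.minor' string: major for breaking, minor otherwise."""
    major, minor = (int(part) for part in current.split("."))
    if breaking:
        return f"{major + 1}.0"
    if compatible:
        return f"{major}.{minor + 1}"
    return current


def earliest_release(announced_on, breaking):
    """Breaking changes wait out the notice period; others ship at once."""
    return announced_on + NOTICE_PERIOD if breaking else announced_on


old = {"order_id": "int", "amount": "decimal", "status": "string"}
new = {"order_id": "int", "amount": "float", "channel": "string"}
breaking, compatible = classify_changes(old, new)
print(breaking)
print(next_version("1.3", breaking, compatible))
print(earliest_release(date(2024, 5, 1), breaking))
```

Running a comparison like this in code review gives the owner a concrete list to send to the consumer channel, and makes it harder for a breaking change to ship without notice.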
Data products in the data mesh context
In a data mesh organisation, data products are the mechanism by which domain teams share data with other domains. Each domain owns the data products it publishes. The data platform team provides the infrastructure (self-serve platform) that makes publishing and consuming data products easy.
Data products in a data mesh must be discoverable (findable via a shared catalogue), addressable (accessible via a consistent addressing scheme — typically a stable URL or qualified table name), trustworthy (quality guarantees enforced by the producing domain), self-describing (documentation embedded in the product, not in a separate wiki), and interoperable (conforming to the platform's conventions so consumers can work with all products consistently).
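One way to make these five properties checkable is to record them in a product descriptor and report the gaps. The sketch below is only illustrative: the `DataProduct` fields, the `mesh_gaps` function, and the `analytics.` naming prefix are all hypothetical, and the prefix check is a crude stand-in for whatever conventions your platform actually defines.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    # All field names are hypothetical; adapt them to your catalogue.
    name: str
    owner: str                       # a named person in the producing domain
    address: str = ""                # stable URL or qualified table name
    catalogue_entry: str = ""        # where consumers discover the product
    columns: list[str] = field(default_factory=list)
    column_docs: dict[str, str] = field(default_factory=dict)
    quality_checks: list[str] = field(default_factory=list)


def mesh_gaps(product, platform_prefix="analytics."):
    """List which of the five properties the product does not yet meet."""
    gaps = []
    if not product.catalogue_entry:
        gaps.append("discoverable: no catalogue entry")
    if not product.address:
        gaps.append("addressable: no stable address")
    if not product.quality_checks:
        gaps.append("trustworthy: no automated quality checks")
    undocumented = [c for c in product.columns if c not in product.column_docs]
    if undocumented:
        gaps.append(
            "self-describing: undocumented columns " + ", ".join(undocumented)
        )
    # Assumed convention: every product address starts with a shared prefix.
    if product.address and not product.address.startswith(platform_prefix):
        gaps.append(
            "interoperable: address does not follow platform naming convention"
        )
    return gaps
```

A platform team could run a check like this before a product is listed in the shared catalogue, so that the standards are enforced at publication time rather than audited afterwards.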
The distinction from hub-and-spoke: in hub-and-spoke, a central data team integrates all data. In data mesh, each domain integrates its own data into a data product and publishes it. Central governance defines the standards; domain teams execute them. For the full data mesh context, see data mesh architecture.
Common mistakes
**Declaring data products without the governance**: labelling existing tables as "data products" without implementing ownership, quality guarantees, or documentation produces a data catalogue with impressive-sounding names and no improvement in data quality or reliability.
**Too many data products at once**: attempting to elevate all 500 tables to data product status simultaneously is operationally impossible. Start with the 10–15 most critical assets and build the governance infrastructure with them. Expand coverage as the programme demonstrates value.
**Ownership without accountability**: naming an owner without defining what ownership means, or without creating the structures that make ownership viable, changes nothing. If the owner has no visibility into quality failures, no mechanism for approving access requests, and no authority over the schema, ownership is nominal rather than real.
**Ignoring the consumer**: data products exist to serve consumers. Building data products by looking inward (what data do we produce?) rather than outward (what data do consumers need?) produces products no one wants.
For the governance framework that supports data products — ownership, definitions, quality standards, access control — see how to build a data governance framework. For the data mesh context, see data mesh architecture. For the semantic layer that provides the canonical metric definitions a data product exposes, see what is a semantic layer.
Our data architecture consulting practice designs data product frameworks for mid-market and enterprise organisations — from the governance structures to the technical implementation. If you are building a data product programme or want to elevate specific high-value data assets, book a free 30-minute audit.