Most data governance programmes fail not because the framework is wrong but because it is not implemented. Here is a governance framework designed for practical implementation: ownership, definitions, quality standards, and access control — with the organisational structures that make them stick.
## The quick answer
A data governance framework is the combination of policies, processes, organisational structures, and technical controls that ensure data is accurate, accessible, and used appropriately. Most governance programmes fail not because the framework is wrong but because it is too bureaucratic to implement, too theoretical to act on, or treated as a one-time project rather than an ongoing practice.
Governance that works has four components: ownership (who is accountable for each data asset), definitions (canonical meaning of key business concepts), quality standards (what "good data" looks like and how it is enforced), and access control (who can access what, with what audit trail). All four must be in place; none works without the others.
## Why governance programmes fail
Understanding failure modes is the starting point for building governance that avoids them.
**Too much governance, too soon**: comprehensive governance frameworks document thousands of data assets, define elaborate stewardship hierarchies, and require approval workflows for every data access request. The result: the programme collapses under its own weight before it delivers value. Start with the highest-value, highest-risk data assets and expand coverage as the programme demonstrates value.
**No ownership**: governance without named owners is aspirational documentation. "The Data Team owns all data quality" is not ownership — it is diffusion of accountability. Every critical data asset must have a named business owner (a person, not a team) who is accountable for its quality and appropriate use.
**Governance as IT project**: governance is an organisational capability, not a technology deployment. Buying a data catalogue tool does not produce governance. The tool supports governance practices; it does not substitute for them. Organisations that frame governance as a tooling project typically produce well-documented data with no change in quality, access discipline, or organisational behaviour.
**Governance theatre**: producing governance documentation (data dictionaries, ownership registers, quality policies) without the enforcement mechanisms to make them real. Governance is effective when it produces observable change in how data is used and maintained, not when it produces documentation that accurately describes how data should be used.
**Confusing governance with compliance**: regulatory requirements (GDPR, HIPAA, BCBS 239) create governance obligations, but compliance and governance are not the same thing. Compliance is the minimum that regulation requires; effective governance goes further, ensuring data is trustworthy and analytically valuable, not just legally defensible.
## The four components of effective data governance
### 1. Ownership
**Data ownership** is the assignment of accountability for a data asset to a named person (the data owner). The data owner is responsible for: defining how the asset should be used, approving access requests, setting quality standards for the asset, and resolving data quality issues when they arise.
Data ownership is distinct from data stewardship (a more operational role — managing the day-to-day quality monitoring and metadata maintenance) and data engineering ownership (who built the pipeline). Ownership is a business accountability, not a technical one. The finance team owns financial data; they do not own the pipeline that loads it.
For practical implementation: start with the highest-value data domains (Customer, Product, Revenue, Employee). For each domain, identify the business leader who would be most affected by poor data quality in that domain. That person — or their delegate — is the data owner.
Ownership must be formalised and visible. Maintain a data ownership register (in the data catalogue or a shared document) that names the owner of each critical data asset, and review and update it quarterly.
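As an illustration only, an ownership register with a quarterly review check could be sketched as below. The record fields, the example names, and the 90-day approximation of "quarterly" are assumptions made for this sketch, not part of any particular catalogue tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class OwnershipRecord:
    asset: str           # data asset or domain name (hypothetical examples below)
    owner: str           # a named person, not a team
    last_reviewed: date  # date of the most recent ownership review


# Assumption: "quarterly" is approximated as 90 days.
REVIEW_INTERVAL = timedelta(days=90)


def overdue_reviews(register, today):
    """Return records whose last review is older than the review interval."""
    return [r for r in register if today - r.last_reviewed > REVIEW_INTERVAL]


register = [
    OwnershipRecord("customer", "A. Example", date(2024, 1, 10)),
    OwnershipRecord("revenue", "B. Example", date(2024, 5, 2)),
]

stale = overdue_reviews(register, today=date(2024, 6, 1))
stale_assets = [r.asset for r in stale]
print(stale_assets)
```

A check like this can feed the quarterly ownership review: any asset it lists needs its owner reconfirmed.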
### 2. Definitions
**Data definitions** establish canonical meanings for key business concepts. When the Finance team and the Sales team calculate "revenue" differently, the problem is not data quality, because the data is correct in both systems. The problem is semantic: the same word means different things to different people.
A data dictionary defines: the canonical name of each concept, its business definition (in plain language), its technical definition (how it is calculated), the source system it comes from, and the exceptions and caveats that apply.
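A dictionary entry with the five fields listed above could be represented roughly as follows. This is a minimal sketch: the class name, field names, and the example metric are hypothetical and chosen only to mirror the list in the text.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    canonical_name: str
    business_definition: str   # plain-language meaning
    technical_definition: str  # how the value is calculated
    source_system: str
    caveats: list = field(default_factory=list)  # exceptions and caveats


REQUIRED_FIELDS = (
    "canonical_name",
    "business_definition",
    "technical_definition",
    "source_system",
)


def missing_fields(entry):
    """Names of required text fields that are empty or only whitespace."""
    return [name for name in REQUIRED_FIELDS if not getattr(entry, name).strip()]


# Hypothetical example entry; the definition is illustrative, not prescriptive.
net_revenue = DictionaryEntry(
    canonical_name="net_revenue",
    business_definition="Invoiced revenue after refunds and discounts.",
    technical_definition="SUM(invoice_amount) - SUM(refund_amount) - SUM(discount_amount)",
    source_system="billing",
    caveats=["Excludes invoices still in draft status."],
)

draft = DictionaryEntry("churn_rate", "Share of customers lost in a period.", "", "crm")
print(missing_fields(net_revenue), missing_fields(draft))
```

A completeness check of this kind is one way to stop half-defined metrics from being published.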
The most important definitions are for the metrics and dimensions that appear in executive reporting and business decisions. Start with the 10–15 metrics that are most frequently disputed or most frequently produce incorrect analysis when stakeholders use them. Define each precisely. Publish the definitions where analysts can find them.
The **semantic layer** (dbt Semantic Layer, Cube, AtScale, or LookML) is the technical implementation of definitions: business logic defined once in code, used by all BI tools that query through the layer. The semantic layer is not a substitute for a data dictionary — the dictionary is human-readable; the semantic layer is machine-executable. They complement each other.
For the technical implementation, see what is a semantic layer.
### 3. Quality standards
**Quality standards** define what "good data" means for each critical data asset, and the controls that ensure the data meets those standards.
A data quality standard for a Customer table might specify: Customer ID must be unique and non-null; Customer Email must be in valid email format; Customer Status must be one of Active, Inactive, Prospect; Customer Created Date must not be in the future; Total Lifetime Value must be non-negative.
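To make the Customer standard concrete, the five rules can be expressed as plain Python checks. This is a sketch under stated assumptions: the column names, the deliberately simple email pattern, and the sample rows are all invented for illustration, and in practice these rules would more likely live in warehouse tests than in application code.

```python
import re
from datetime import date

ALLOWED_STATUSES = {"Active", "Inactive", "Prospect"}
# Assumption: a deliberately simple pattern; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_customers(rows, today):
    """Return a list of (customer_id, violated_rule) tuples."""
    failures = []
    seen_ids = set()
    for row in rows:
        cid = row["customer_id"]
        if cid is None:
            failures.append((cid, "customer_id is null"))
        elif cid in seen_ids:
            failures.append((cid, "customer_id is not unique"))
        else:
            seen_ids.add(cid)
        if not EMAIL_PATTERN.match(row["email"] or ""):
            failures.append((cid, "email format invalid"))
        if row["status"] not in ALLOWED_STATUSES:
            failures.append((cid, "status not in accepted values"))
        if row["created_date"] > today:
            failures.append((cid, "created_date is in the future"))
        if row["lifetime_value"] < 0:
            failures.append((cid, "lifetime_value is negative"))
    return failures


rows = [
    {"customer_id": 1, "email": "a@example.com", "status": "Active",
     "created_date": date(2024, 1, 5), "lifetime_value": 120.0},
    {"customer_id": 1, "email": "not-an-email", "status": "Churned",
     "created_date": date(2030, 1, 1), "lifetime_value": -5.0},
]

failures = check_customers(rows, today=date(2024, 6, 1))
for customer_id, rule in failures:
    print(customer_id, rule)
```

The first sample row passes every rule; the second violates all five.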
Quality standards are implemented as technical controls: dbt tests (not_null, unique, accepted_values, custom business-logic tests), ingestion-layer validation (schema validation and range checks at ingestion), and anomaly detection (tools such as Monte Carlo or Bigeye, which detect statistical anomalies that explicit rules miss). See data quality management for the full technical implementation guide.
Quality standards also require **quality SLAs**: if a data quality failure is detected, how quickly must it be resolved? For financial reporting data: same-day resolution. For operational analytics: next-business-day resolution. Without SLAs, quality standards are aspirational — with SLAs, they create accountability.
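One way to make these SLAs checkable is to compute a resolution deadline from the detection time. The sketch below assumes "same-day" means by the end of the detection day and "next-business-day" means by the end of the next weekday; the category names are hypothetical and public holidays are ignored.

```python
from datetime import datetime, time, timedelta


def resolution_deadline(detected_at, category):
    """Deadline for resolving a quality failure, per the SLA category."""
    day = detected_at.date()
    if category == "financial_reporting":
        pass  # same-day resolution: deadline is the end of the detection day
    elif category == "operational_analytics":
        day += timedelta(days=1)
        while day.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            day += timedelta(days=1)
    else:
        raise ValueError(f"unknown SLA category: {category}")
    return datetime.combine(day, time(23, 59, 59))


def is_breached(detected_at, category, now):
    """True when an unresolved failure has passed its deadline."""
    return now > resolution_deadline(detected_at, category)


friday_morning = datetime(2024, 5, 31, 10, 0)  # 31 May 2024 is a Friday
print(resolution_deadline(friday_morning, "financial_reporting"))
print(resolution_deadline(friday_morning, "operational_analytics"))
```

With a deadline function in place, breached SLAs can be reported to data owners rather than discovered by accident.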
**Certified content** is the BI governance mechanism: a certified data source or dashboard has been reviewed by the data owner, meets quality standards, and is endorsed for business decision-making. Tableau's certified data source feature, Power BI's endorsement feature, and Looker's content management system all support certified content workflows.
### 4. Access control
**Access control** governs who can access what data, under what conditions, and with what audit trail. In enterprise environments, access control serves multiple purposes: security (preventing unauthorised access), privacy (limiting access to personal data to those with a legitimate purpose), and regulatory compliance (documenting who accessed regulated data and when).
**Role-based access control (RBAC)** assigns permissions to roles, and roles to users. Rather than managing individual user permissions, you manage a smaller set of roles (Data Analyst - Sales Region, Finance Viewer, Data Engineer - Production) and assign users to roles. Role membership is reviewed and updated as team membership changes.
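The role indirection can be shown in a few lines. This is a minimal sketch: the role names come from the paragraph above, while the users, resource names, and permission tuples are hypothetical.

```python
# Permissions attach to roles, expressed here as (resource, action) pairs.
ROLE_PERMISSIONS = {
    "Data Analyst - Sales Region": {("sales.orders", "read")},
    "Finance Viewer": {("finance.ledger", "read")},
    "Data Engineer - Production": {
        ("sales.orders", "read"),
        ("sales.orders", "write"),
        ("finance.ledger", "read"),
    },
}

# Users are assigned roles, never individual permissions.
USER_ROLES = {
    "alice": {"Finance Viewer"},
    "bob": {"Data Analyst - Sales Region"},
}


def is_allowed(user, resource, action):
    """True if any of the user's roles grants the (resource, action) pair."""
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "finance.ledger", "read"))
print(is_allowed("bob", "finance.ledger", "read"))
```

When someone changes team, only their entry in the user-to-role mapping changes; the permission definitions stay untouched.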
**Attribute-based access control (ABAC)** extends RBAC with contextual attributes: a user can access customer data, but only for customers in their assigned territory, during business hours. ABAC is more flexible than RBAC for complex permission structures but more complex to implement and audit.
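The territory-and-business-hours example can be sketched as a single policy function. The attribute names, the 09:00 to 17:00 weekday window, and the sample user and customer are assumptions made for illustration.

```python
from datetime import datetime, time


def can_view_customer(user, customer, now, open_at=time(9), close_at=time(17)):
    """Allow access only within the user's territories and during business hours."""
    in_territory = customer["territory"] in user["territories"]
    # Assumption: business hours are weekdays between open_at and close_at.
    in_hours = now.weekday() < 5 and open_at <= now.time() < close_at
    return in_territory and in_hours


user = {"name": "alice", "territories": {"EMEA"}}
emea_customer = {"id": 42, "territory": "EMEA"}
apac_customer = {"id": 43, "territory": "APAC"}

monday_morning = datetime(2024, 6, 3, 10, 30)  # 3 June 2024 is a Monday
monday_evening = datetime(2024, 6, 3, 20, 0)

print(can_view_customer(user, emea_customer, monday_morning))
print(can_view_customer(user, emea_customer, monday_evening))
print(can_view_customer(user, apac_customer, monday_morning))
```

Each extra attribute adds another condition to test and audit, which is where the implementation complexity of ABAC comes from.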
**Column-level and row-level security**: for sensitive data fields (personal data, financial data, compensation data), column-level masking (dynamically masking sensitive columns for users without appropriate clearance) and row-level security (restricting rows visible to a user based on their attributes) provide granular data access control. Snowflake's dynamic data masking, BigQuery's column-level security, and Power BI's row-level security implement these controls.
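The two mechanisms compose naturally: filter rows first, then mask columns. The sketch below illustrates the logic in plain Python and is not how any of the named platforms implement it; the column names, the "pii" clearance label, and the mask string are assumptions.

```python
SENSITIVE_COLUMNS = {"email"}  # assumption: columns classified as personal data
MASK = "***MASKED***"


def apply_security(rows, user):
    """Apply row-level filtering, then column-level masking, for one user."""
    # Row-level security: keep only rows in the user's permitted regions.
    visible = [row for row in rows if row["region"] in user["regions"]]
    # Column-level masking: hide sensitive columns unless the user has clearance.
    if "pii" in user["clearances"]:
        return [dict(row) for row in visible]
    return [
        {col: (MASK if col in SENSITIVE_COLUMNS else val) for col, val in row.items()}
        for row in visible
    ]


rows = [
    {"customer_id": 1, "region": "EMEA", "email": "a@example.com"},
    {"customer_id": 2, "region": "APAC", "email": "b@example.com"},
]

analyst = {"regions": {"EMEA"}, "clearances": set()}
privacy_officer = {"regions": {"EMEA", "APAC"}, "clearances": {"pii"}}

print(apply_security(rows, analyst))
print(apply_security(rows, privacy_officer))
```

In a warehouse these rules would be declared as policies on the table, so every query path enforces them.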
**Access request and approval workflows**: ad-hoc access requests should go through a documented approval process — request, justification, data owner approval, provisioning, expiry. A data catalogue (Atlan, Collibra, Microsoft Purview) typically provides this workflow alongside the metadata management capability.
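The request, justification, owner approval, and expiry steps can be modelled as a small state change. This is a sketch, not a catalogue tool's API: the class, the status values, and the 30-day default validity are hypothetical, and provisioning in the target system is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AccessRequest:
    requester: str
    asset: str
    justification: str
    status: str = "requested"
    expires_on: Optional[date] = None


def approve(request, approver, data_owner, today, days_valid=30):
    """Approve a request; only the data owner may approve, and access expires."""
    if not request.justification.strip():
        raise ValueError("a justification is required")
    if approver != data_owner:
        raise PermissionError("only the data owner may approve access")
    request.status = "approved"
    request.expires_on = today + timedelta(days=days_valid)
    return request


def is_active(request, today):
    """True while an approved grant has not yet expired."""
    return request.status == "approved" and today <= request.expires_on


req = AccessRequest("bob", "finance.ledger", "Quarter-end reconciliation")
approve(req, approver="carol", data_owner="carol", today=date(2024, 6, 1))
print(req.status, req.expires_on)
```

Building expiry into the approval itself means access lapses by default and has to be re-justified.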
**Audit logging**: every data access should be logged: who accessed what data, when, and from which system. Audit logs are required or expected as evidence under SOC 2, HIPAA, GDPR (for personal data access), and financial services regulations. Snowflake's QUERY_HISTORY, BigQuery's DATA_ACCESS audit logs, and Tableau's site activity logs provide the raw data; a SIEM or centralised log management system aggregates them.
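When aggregating logs from several platforms, it helps to normalise each event into one record shape. The following is a minimal sketch of such a record as a JSON line; the field names and the example values are assumptions, not the schema of any of the platforms named above.

```python
import json
from datetime import datetime, timezone


def audit_record(user, resource, source_system, accessed_at=None):
    """Serialise one access event as a JSON line: who, what, when, from where."""
    accessed_at = accessed_at or datetime.now(timezone.utc)
    return json.dumps(
        {
            "who": user,
            "what": resource,
            "when": accessed_at.isoformat(),
            "from": source_system,
        },
        sort_keys=True,
    )


line = audit_record(
    "alice",
    "finance.ledger",
    "bi-dashboard",  # hypothetical source system name
    accessed_at=datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),
)
print(line)
```

A consistent shape across sources makes questions such as "who read this table last month" answerable with a single query.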
## Building the programme in phases
**Phase 1 (0–90 days)**: identify and document ownership for the top 10 most critical data assets. Define the top 10 most-disputed business metrics. Implement automated quality checks (dbt tests) on the critical path data. Establish a data governance council (monthly meeting, business owners of critical data domains, data team lead, IT security representative).
**Phase 2 (90 days – 6 months)**: expand data dictionary to cover all metrics in executive reporting. Implement access request workflow for sensitive data. Deploy a data catalogue tool (or configure dbt Docs as a lightweight starting point). Establish certified data source programme in the BI platform.
**Phase 3 (6–18 months)**: expand quality standards and automated testing to all data domains. Implement semantic layer for metric consistency. Establish data quality SLAs with data owners. Conduct first data governance audit.
**Ongoing**: quarterly data ownership reviews (are the named owners still current?), quarterly quality report to business stakeholders, annual governance programme review against business requirements.
## Tooling
Data governance tools — data catalogues and governance platforms — support governance practices but do not create them. The right tool depends on environment complexity:
**Lightweight (dbt Docs + Notion/Confluence)**: for small data teams with a dbt-based architecture, dbt-generated documentation provides a lightweight data catalogue with lineage. Governance policies, data dictionaries, and ownership records can be maintained in Confluence or Notion. Appropriate for organisations with 1–5 data engineers and limited governance complexity.
**Mid-market**: Atlan (strong dbt and warehouse integration, user-friendly UI), DataHub (open-source, strong engineering community), Monte Carlo (adds observability alongside governance). Appropriate for organisations with 5–20 data team members and significant data asset volume.
**Enterprise**: Microsoft Purview (native for Azure/Microsoft environments, integrates with ADF, Synapse, Power BI), Collibra (established enterprise governance platform, strong workflow management), Alation (BI-focused governance, strong Tableau integration). Appropriate for large organisations with dedicated governance programmes and regulatory compliance requirements.
For the full context on data governance as part of the modern data architecture, see what is data governance and data lineage.
Our data architecture consulting practice designs governance frameworks as part of platform builds and as standalone governance programmes. If your governance programme is not sticking or you are starting a governance initiative, book a free 30-minute audit and we will tell you directly what is missing.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.