BlogData Architecture

What Is Column-Level Security in a Data Warehouse?

Austin Duncan
Austin Duncan
Project Manager & Data Strategist
·April 19, 202810 min read

Column-level security restricts access to specific columns in a database or data warehouse based on a user's role or identity. This guide explains how column-level security works in Snowflake, BigQuery, and Databricks, the difference between masking and filtering, and when this level of access control is necessary.

Column-level security restricts access to specific columns in a database table based on a user's identity or role. A table may contain both publicly visible information (product name, price, category) and sensitive information (customer email, SSN, salary). Column-level security ensures that users or groups who need access to the table for some columns are prevented from seeing others — without requiring separate tables or duplicate data.

It is distinct from row-level security (which filters which rows a user can see) and table-level permissions (which control whether a user can query the table at all). Column-level security operates at the field level, enabling fine-grained control over which pieces of information are visible to different users.

Why Column-Level Security Matters

Modern data warehouses consolidate data from many source systems, often including sensitive information that must be governed carefully. A single customer table in a data warehouse may contain:

- Name, address, email — PII governed by GDPR, CCPA

- Date of birth — sensitive PII with additional regulatory implications

- Social Security Number — highly regulated, often should not be stored in analytics environments at all

- Transaction history and spending patterns — financial data with potential regulatory exposure

- Health indicators (for insurers or health plans) — PHI governed by HIPAA

Different users have different legitimate needs:

- A marketing analyst needs product preference and channel data but not SSN or salary

- An HR reporting team needs salary ranges and department data but not customer PII

- A compliance auditor needs access to a broader set for audit purposes

- A data scientist building a churn model needs behavior data but not payment card numbers

Without column-level security, the choices are: grant broad access (risky), create separate sanitized views for each use case (brittle, maintenance-intensive), or maintain separate tables with different data subsets (expensive duplication). Column-level security enables a clean, policy-driven solution.

Column-Level Security vs Column Masking

There are two related but distinct mechanisms:

**Column-level access control** — users without the required role or permission cannot query the column at all. Queries that reference the column return an error or the column is silently omitted from results (depending on implementation). The user knows the column exists but cannot see its values.

**Column masking** (also called dynamic data masking) — users see a transformed version of the column based on their role. A marketing analyst sees "user@example.com" for email; a compliance officer with higher access sees the real email "john.smith@company.com". The analyst's query returns results without error, but sensitive values are replaced with masked versions. The same query, different results based on who runs it.

Masking is generally preferred over pure access denial for operational usability: the table remains queryable, reports work without errors, and the security is transparent to the user. Access denial is appropriate when the column should not be visible in any form to certain roles (column names themselves are visible even without data access, which may be undesirable).

Snowflake Column-Level Security

Snowflake implements column-level security through dynamic data masking policies:

A masking policy is a schema-level object that defines transformation rules based on the current user's role. The policy is then applied to specific columns in specific tables or views:

CREATE MASKING POLICY email_mask AS

(val STRING) RETURNS STRING ->

CASE

WHEN CURRENT_ROLE() IN ('COMPLIANCE_ANALYST', 'SYSADMIN') THEN val

ELSE '****@****.***'

END;

ALTER TABLE customers

MODIFY COLUMN email

SET MASKING POLICY email_mask;

Users with the COMPLIANCE_ANALYST or SYSADMIN role see the real email. All other roles see the masked value. The policy is centrally managed — changing who can see unmasked data requires updating the policy, not re-granting permissions on every dependent view.

Snowflake also supports column-level tagging via object tagging and tag-based masking policies, which simplifies applying consistent masking rules to all columns tagged as PII across the warehouse.

BigQuery Column-Level Security

BigQuery implements column-level security through policy tags in the Data Catalog:

1. A policy tag taxonomy is created in Data Catalog (e.g., taxonomy: "PII Classification", tags: "Sensitive", "Highly Sensitive")

2. Policy tags are assigned to sensitive columns in BigQuery tables

3. IAM policy bindings on the policy tags control which users can see the column

Users without the "Fine-Grained Reader" IAM role on a policy tag see NULL when querying that column; they do not receive an error. Users with the role see the actual values.

BigQuery also supports dynamic data masking on top of policy tags, enabling role-based value transformation similar to Snowflake's masking policies.

Databricks and Unity Catalog

Unity Catalog (Databricks' unified governance layer) supports both column masking and row filtering with rich policy options:

Column masks are SQL expressions that define what different principals see. Row filters define which rows are visible to each principal. Both are applied at the table level and evaluated at query time using the current user's identity.

Unity Catalog policies integrate with external identity providers (Azure Active Directory, AWS IAM), enabling column-level security that maps directly to organizational identity groups rather than requiring manual role maintenance in the warehouse.

Row-Level Security vs Column-Level Security

These are separate mechanisms that address different access dimensions and are often used together:

**Row-level security** (RLS) restricts which rows a user can see in a query. A regional sales rep sees only rows where region = 'EMEA'. A manager sees all regions. RLS filters the dataset; column-level security filters the fields within each row.

Combined: a sales dashboard query returns only the rows the user's RLS policy allows, and within those rows, masks columns the user is not authorized to see in full. Both policies apply simultaneously.

In Snowflake, row access policies work alongside masking policies. In BigQuery, row-level security is implemented through authorized views or row access policies. In Databricks Unity Catalog, row filters and column masks are both managed through the same policy framework.

Implementing Column-Level Security in Practice

Governance tooling (Collibra, Alation, DataHub) can integrate with column-level security implementations to provide a discovery layer: data consumers can see that a column exists and what it contains in the abstract, request access through a governance workflow, and be granted the appropriate role when approved.

For compliance use cases (GDPR, CCPA, HIPAA, PCI DSS), column-level security is often a requirement, not a preference. Demonstrating to an auditor that PII columns are masked for users without explicit authorization is easier with a policy-driven system that applies consistently than with a patchwork of views and manually maintained permissions.

The practical recommendation: implement column-level security as part of warehouse setup, not as a retrofit. Retroactively applying masking policies to a warehouse where analysts have already built views and queries against unmasked columns requires coordination and communication. Starting with policies in place means the governance layer is part of the foundation rather than an obstacle added later.

Our data architecture practice designs and implements data warehouse security architectures including column-level and row-level security — contact us to discuss governance and access control requirements for your analytics environment.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →