
Customer 360: Building a Unified View of Your Customer in the Data Warehouse

Obed Tsimi
Founder & Senior Tableau Architect
December 6, 2026 · 11 min read

How to build a true Customer 360 — the identity resolution architecture that merges fragmented customer identifiers across CRM, product, billing, and support systems into a single entity, the data model that makes it analytically useful, and the business use cases it unlocks.

A Customer 360 is the idea of having a single, unified record for each customer that aggregates all the information the organisation holds about them — from CRM interactions to product usage, billing history, and support contacts. Every organisation claims to want one. Very few successfully build one.

The failure is almost never in the BI layer. It is an identity resolution problem: the same customer appears as five different records across five systems, each with slightly different identifiers, spellings, or formats. Without resolving these into a single entity, a "Customer 360" is actually five separate views of the same customer, none of which is complete.

The Identity Resolution Problem

A mid-sized SaaS company might have the same customer recorded as:

- Salesforce CRM: account_id = ACC-1234, company name = "Acme Corp", contact email = "sarah@acme.com"

- Product database: user_id = 8842, email = "sarah@acme.co", company_id = null

- Billing system: customer_id = cus_abc123, email = "billing@acme.com", company = "Acme Corporation"

- Support platform: org_id = 9921, domain = "acme.com"

- Marketing automation: contact_id = C-557822, email = "s.smith@acme.com"

Five records. Four email addresses. Two company name spellings. No shared identifier across all systems.

Identity resolution is the process of determining which records across these systems refer to the same real-world entity and creating a unified identifier for them.

**Deterministic matching** links records that share an exact match on a high-confidence identifier — the same email address, the same domain, the same tax ID, or a shared internal identifier. Deterministic matching is fast and precise but only resolves records where the shared identifier is present and consistent.

**Probabilistic matching** links records that are probably the same entity based on a combination of similar (but not identical) attributes — fuzzy name matching, overlapping domain, proximity in signup date and geography. Probabilistic matching extends coverage but introduces false positives.
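As a rough illustration of probabilistic matching, the sketch below blends a fuzzy company-name similarity (using `difflib.SequenceMatcher` from the Python standard library) with exact agreement on domain and country. The weights, the 0.8 threshold, the `country` field, and the "Apex Corp" record are illustrative assumptions, not values recommended by this article.

```python
from difflib import SequenceMatcher

# Illustrative weights and threshold; tune them against manually reviewed pairs.
WEIGHTS = {"name": 0.5, "domain": 0.3, "country": 0.2}
MATCH_THRESHOLD = 0.8


def name_similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1] between two company names."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def exact_agreement(a, b) -> float:
    """1.0 when both values are present and equal, otherwise 0.0."""
    return 1.0 if a and a == b else 0.0


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted similarity across name, domain, and country."""
    return (
        WEIGHTS["name"] * name_similarity(rec_a["company"], rec_b["company"])
        + WEIGHTS["domain"] * exact_agreement(rec_a.get("domain"), rec_b.get("domain"))
        + WEIGHTS["country"] * exact_agreement(rec_a.get("country"), rec_b.get("country"))
    )


crm = {"company": "Acme Corp", "domain": "acme.com", "country": "US"}
billing = {"company": "Acme Corporation", "domain": "acme.com", "country": "US"}
other = {"company": "Apex Corp", "domain": "apex.io", "country": "DE"}

same_entity = match_score(crm, billing) >= MATCH_THRESHOLD      # linked
different_entity = match_score(crm, other) >= MATCH_THRESHOLD   # not linked
```

Name similarity alone would not separate these pairs cleanly, which is why the score combines several attributes. Where the threshold sits determines the trade-off between missed matches and false positives.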

A practical identity resolution pipeline:

1. Normalise all identifiers: lowercase emails, trim whitespace, standardise phone number formats, strip LLC/Inc/Corp suffixes from company names.

2. Deterministic pass: link records sharing the exact same normalised email address, the same verified domain (where domain ownership is confirmed), or the same billing tax ID.

3. Probabilistic pass: for records that were not linked deterministically, apply a similarity score across name, domain, and geography. Records above the confidence threshold are linked; records below are flagged for manual review.

4. Generate a golden record ID: for each linked cluster, assign a canonical customer identifier (customer_key) that all downstream systems and models use.

5. Store the linkage: maintain a mapping table (source_system, source_record_id, customer_key) that records which source records map to which golden entity.
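The normalisation, deterministic pass, key generation, and mapping-table steps can be sketched in a few dozen lines. This is a minimal sketch, not a production pipeline: the `VERIFIED_DOMAINS` set, the `CUST-` key format, the record layout, and the helper names are all hypothetical, and the probabilistic pass is omitted.

```python
import re

# Hypothetical set of domains whose ownership has been confirmed.
VERIFIED_DOMAINS = {"acme.com"}
SUFFIX_RE = re.compile(r"[\s,]+(llc|inc|corp|corporation|ltd)\.?$", re.IGNORECASE)


def normalise_email(email):
    return email.strip().lower() if email else None


def normalise_company(name):
    """Strip a trailing legal suffix and lowercase (input to fuzzy matching)."""
    return SUFFIX_RE.sub("", name.strip()).lower() if name else None


def match_keys(rec):
    """Deterministic keys: the normalised email and, if verified, the domain."""
    keys = []
    email = normalise_email(rec.get("email"))
    domain = (rec.get("domain") or (email.partition("@")[2] if email else "")).lower()
    if email:
        keys.append(("email", email))
    if domain in VERIFIED_DOMAINS:
        keys.append(("domain", domain))
    return keys


def resolve(records):
    """Return mapping rows (source_system, source_record_id, customer_key)."""
    parent = list(range(len(records)))  # union-find over record indexes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}
    for i, rec in enumerate(records):
        for key in match_keys(rec):
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])  # merge the two clusters
            else:
                first_seen[key] = i

    roots = sorted({find(i) for i in range(len(records))})
    key_of = {root: f"CUST-{n:06d}" for n, root in enumerate(roots, start=1)}
    return [(r["system"], r["id"], key_of[find(i)]) for i, r in enumerate(records)]


records = [
    {"system": "crm", "id": "ACC-1234", "email": " Sarah@Acme.com"},
    {"system": "product", "id": "8842", "email": "sarah@acme.co"},
    {"system": "billing", "id": "cus_abc123", "email": "billing@acme.com"},
    {"system": "support", "id": "9921", "domain": "acme.com"},
    {"system": "marketing", "id": "C-557822", "email": "s.smith@acme.com"},
]
mapping = resolve(records)
```

With this input, the CRM, billing, support, and marketing records share the verified `acme.com` domain and land in one cluster. The product record (`acme.co`) stays separate and would be a candidate for the probabilistic pass. The keys here are assigned by position, so a real pipeline would need a scheme that keeps `customer_key` stable across runs.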

The Customer 360 Data Model

Once entities are resolved, the Customer 360 data model organises all available customer data into a unified structure. The canonical model:

**dim_customer (the golden record):** One row per resolved customer entity. Contains the canonical attributes: the agreed name, primary email, primary domain, industry, company size, geographic location, account tier. All other tables join to this record via customer_key.

**fact_customer_interactions:** One row per interaction event — CRM activities, support tickets, marketing email opens, web sessions, product usage events. Each event has: customer_key, interaction_type, interaction_channel, interaction_timestamp, and relevant attributes. This is the event timeline that supports contact history analysis.

**fact_customer_financials:** One row per billing period per customer — contract value, MRR/ARR, invoice amount, payment status, churn events, expansion events. This is the financial relationship history.

**fact_product_usage:** One row per usage event or usage session. Feature usage frequency, login events, feature adoption, API call volume — whatever the product captures. Joined to dim_customer for cohort analysis and adoption scoring.

**dim_customer_relationships:** Junction table capturing relationships between customers (parent company to subsidiary, partner to end customer, referral source) where these exist.
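To make the join pattern concrete, here is a small sketch that builds three of these tables in an in-memory SQLite database via Python's `sqlite3` module. The column names, data types, and sample rows are assumptions for illustration; a warehouse implementation would use its own DDL and a fuller attribute set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key   TEXT PRIMARY KEY,
    name           TEXT,
    primary_domain TEXT,
    account_tier   TEXT
);
CREATE TABLE fact_customer_interactions (
    customer_key          TEXT REFERENCES dim_customer(customer_key),
    interaction_type      TEXT,
    interaction_channel   TEXT,
    interaction_timestamp TEXT
);
CREATE TABLE fact_customer_financials (
    customer_key   TEXT REFERENCES dim_customer(customer_key),
    billing_period TEXT,
    mrr            REAL,
    payment_status TEXT
);
INSERT INTO dim_customer VALUES ('CUST-000001', 'Acme', 'acme.com', 'Pro');
INSERT INTO fact_customer_interactions VALUES
    ('CUST-000001', 'support_ticket', 'email', '2026-11-03T10:15:00'),
    ('CUST-000001', 'crm_call', 'phone', '2026-11-10T14:00:00');
INSERT INTO fact_customer_financials VALUES
    ('CUST-000001', '2026-10', 500.0, 'paid'),
    ('CUST-000001', '2026-11', 650.0, 'paid');
""")

# Aggregate each fact table on its own before joining to the dimension.
# Joining two fact tables directly would multiply rows and inflate the totals.
row = conn.execute("""
SELECT c.name, i.n_interactions, f.billed_to_date
FROM dim_customer AS c
LEFT JOIN (SELECT customer_key, COUNT(*) AS n_interactions
           FROM fact_customer_interactions
           GROUP BY customer_key) AS i USING (customer_key)
LEFT JOIN (SELECT customer_key, SUM(mrr) AS billed_to_date
           FROM fact_customer_financials
           GROUP BY customer_key) AS f USING (customer_key)
""").fetchone()
```

Every fact table carries `customer_key` and nothing else from the dimension, so any combination of facts can be brought together through `dim_customer`.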

Business Use Cases

**Customer health scoring.** A health score aggregates product usage, support interactions, payment history, and contract status into a single signal. A customer who has logged in less than once per week in the last 30 days, submitted three support tickets in the last month, and has a contract renewal in 60 days is a churn risk. The Customer 360 provides all the inputs; the health score is the synthesis.
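The churn-risk example above translates into a simple rule-based score. The sketch below is one possible shape, assuming the three inputs have already been aggregated per customer; the field names and penalty weights are hypothetical, and a real score would be calibrated against observed churn.

```python
from dataclasses import dataclass

# Illustrative penalties per risk flag; not calibrated values.
PENALTIES = {
    "low_login_frequency": 40,
    "elevated_support_volume": 30,
    "renewal_approaching": 30,
}


@dataclass
class CustomerSignals:
    logins_last_30d: int
    support_tickets_last_30d: int
    days_to_renewal: int


def risk_flags(s: CustomerSignals) -> list:
    """Return the names of the risk conditions this customer currently meets."""
    flags = []
    if s.logins_last_30d < 30 / 7:  # fewer than one login per week on average
        flags.append("low_login_frequency")
    if s.support_tickets_last_30d >= 3:
        flags.append("elevated_support_volume")
    if 0 <= s.days_to_renewal <= 60:
        flags.append("renewal_approaching")
    return flags


def health_score(s: CustomerSignals) -> int:
    """Start at 100 and subtract a penalty for each risk flag."""
    return 100 - sum(PENALTIES[flag] for flag in risk_flags(s))


at_risk = CustomerSignals(logins_last_30d=3, support_tickets_last_30d=3, days_to_renewal=60)
healthy = CustomerSignals(logins_last_30d=22, support_tickets_last_30d=0, days_to_renewal=240)
```

Keeping the flags separate from the score lets an account manager see why a customer scored low, not just that they did.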

**Expansion identification.** Customers who are heavily using a feature that is gated at a higher tier, who have multiple departments using the product independently (multiple contact domains without a consolidated account), or who have grown in headcount since contract signing are expansion candidates. The Customer 360 surfaces these signals.

**Customer segmentation.** Combining firmographic attributes (industry, size, geography), behavioural attributes (product usage intensity, feature adoption depth), and financial attributes (ARR, growth rate, payment reliability) enables multidimensional segmentation — not just "enterprise vs SMB" but analytical segments that drive differentiated treatment.

**Churn analysis.** With complete interaction, usage, and financial history in one model, retrospective churn analysis becomes possible: which behaviours reliably preceded churn, how far in advance were those signals detectable, and what interventions correlated with retention. This requires historical records — SCD Type 2 snapshots of the dim_customer attributes that change over time.
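Retrospective analysis depends on being able to ask what a customer looked like on a given date. A minimal sketch of an SCD Type 2 "as of" lookup follows, assuming each history row carries `valid_from` and `valid_to` columns with a half-open validity interval and `None` marking the current version. These column names and the sample rows are illustrative assumptions.

```python
from datetime import date

# Hypothetical SCD Type 2 history: one row per version of a customer's attributes.
dim_customer_history = [
    {"customer_key": "CUST-000001", "account_tier": "Starter",
     "valid_from": date(2025, 1, 1), "valid_to": date(2025, 9, 1)},
    {"customer_key": "CUST-000001", "account_tier": "Pro",
     "valid_from": date(2025, 9, 1), "valid_to": None},  # None marks the current row
]


def attributes_as_of(history, customer_key, as_of):
    """Return the version valid on `as_of` (valid_from <= as_of < valid_to), or None."""
    for row in history:
        if row["customer_key"] != customer_key:
            continue
        valid_to = row["valid_to"] or date.max
        if row["valid_from"] <= as_of < valid_to:
            return row
    return None


before_upgrade = attributes_as_of(dim_customer_history, "CUST-000001", date(2025, 6, 15))
after_upgrade = attributes_as_of(dim_customer_history, "CUST-000001", date(2025, 9, 1))
```

Using a half-open interval means a date that falls exactly on a change boundary resolves to one version only.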

**Attribution and customer journey analysis.** Connecting marketing touchpoints (from the marketing analytics stack) to CRM interactions and product adoption events via a shared customer identifier creates a complete pre- and post-acquisition journey. The Customer 360 provides the identity resolution that makes this possible.

Data Quality Requirements

A Customer 360 is only as good as its identity resolution. Incomplete resolution (missed matches) produces multiple fragmented records per customer. Incorrect resolution (false matches) merges different customers into a single record — a data quality failure that propagates to every downstream analysis.

The two most important data quality tests for a Customer 360: check for source records mapped to more than one customer_key (a resolution error), and check for dangling source records that have no golden record mapping (incomplete resolution).
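Both checks are straightforward to automate against the mapping table. The sketch below assumes mapping rows shaped as (source_system, source_record_id, customer_key) tuples and a list of (source_system, source_record_id) pairs extracted from the source systems; the function names and sample data are hypothetical.

```python
from collections import defaultdict


def find_duplicate_mappings(mapping_rows):
    """Source records that map to more than one customer_key."""
    keys_by_source = defaultdict(set)
    for system, record_id, customer_key in mapping_rows:
        keys_by_source[(system, record_id)].add(customer_key)
    return {src: keys for src, keys in keys_by_source.items() if len(keys) > 1}


def find_dangling_records(source_records, mapping_rows):
    """Source records that have no row in the mapping table."""
    mapped = {(system, record_id) for system, record_id, _ in mapping_rows}
    return sorted(set(source_records) - mapped)


mapping_rows = [
    ("crm", "ACC-1234", "CUST-000001"),
    ("billing", "cus_abc123", "CUST-000001"),
    ("billing", "cus_abc123", "CUST-000002"),  # deliberate error for the example
]
source_records = [("crm", "ACC-1234"), ("billing", "cus_abc123"), ("product", "8842")]

duplicates = find_duplicate_mappings(mapping_rows)
dangling = find_dangling_records(source_records, mapping_rows)
```

In a pipeline, either result being non-empty would typically fail the run or raise an alert before downstream models are rebuilt.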

Monitor resolution quality over time: track the percentage of source records that are deterministically resolved, probabilistically resolved, and unresolved. Track the false match rate (where human review or downstream signals indicate records were incorrectly merged). A degrading resolution rate usually indicates that a new source system, or a change in an existing one, is not being handled by the resolution pipeline.
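A minimal sketch of these metrics, assuming each source record is labelled with how it was resolved and that a log of reviewed merges is available. The labels, the 5-point tolerance, and the function names are illustrative assumptions.

```python
from collections import Counter

METHODS = ("deterministic", "probabilistic", "unresolved")


def resolution_rates(methods):
    """Share of source records in each resolution category."""
    counts = Counter(methods)
    total = len(methods)
    return {m: (counts[m] / total if total else 0.0) for m in METHODS}


def false_match_rate(reviewed_merges):
    """Share of reviewed merges judged incorrect (True means a false match)."""
    return sum(reviewed_merges) / len(reviewed_merges) if reviewed_merges else 0.0


def resolution_degraded(previous, current, tolerance=0.05):
    """Flag a run whose unresolved share rose by more than `tolerance`."""
    return current["unresolved"] - previous["unresolved"] > tolerance


last_week = resolution_rates(["deterministic", "deterministic", "probabilistic", "unresolved"])
this_week = resolution_rates(["deterministic", "probabilistic", "unresolved", "unresolved"])
alert = resolution_degraded(last_week, this_week)
```

Storing one row of these rates per pipeline run gives a time series that can be charted and alerted on like any other operational metric.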

