
Embedded Analytics Patterns: Architecture Decisions for Product Teams

Eric Chen
Senior BI Solutions Architect
July 2, 2027 · 11 min read

Embedding analytics in a product — giving customers visibility into their data inside your application — is a strategic capability that differentiates SaaS products. But building embedded analytics well requires architectural decisions that are easy to get wrong: multi-tenancy, data isolation, performance under concurrent usage, and licensing. This guide covers the patterns that work.

Embedded analytics has become a standard expectation for SaaS products. Customers expect to see usage metrics, performance data, financial summaries, and operational insights in the product they are already using, not in a separate analytics portal they have to navigate to.

Building embedded analytics well involves architectural decisions that are not obvious, and the failure modes of a wrong choice are not immediately visible. The decisions made at the start of an embedded analytics implementation shape the product's performance, security, and scalability for years.

The Multi-Tenancy Isolation Problem

The most critical architectural decision in embedded analytics is data isolation: ensuring that Customer A cannot see Customer B's data, even through partial exposure (slow queries, API timing variations, error messages that reveal data about other tenants).

**Logical isolation via row-level security**: The simplest approach — all customers' data in the same database, with row-level filters applied based on the authenticated customer identity. A single row-level security policy filters every query to the rows belonging to the requesting customer. This works reliably when implemented correctly at the data source level (not the BI tool level) and is cost-efficient because it requires no per-customer infrastructure.

The risk: a bug in the row-level security logic can expose one customer's data to another. Test RLS implementations rigorously, including negative tests that verify that a customer with valid credentials cannot access another customer's rows by any means (direct query, API, BI tool bypass).
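A negative test of this kind can be sketched as follows. This is a minimal illustration, not a production setup: SQLite has no row-level security, so the hypothetical `fetch_events` accessor stands in for the policy, and the table, tenant names, and function names are all assumptions made for the example. In PostgreSQL the filter would live in the database itself, as the comment shows.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build an in-memory table holding rows for two hypothetical tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (tenant_id TEXT NOT NULL, metric TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [("acme", "logins", 10), ("acme", "exports", 3), ("globex", "logins", 99)],
    )
    return conn


def fetch_events(conn: sqlite3.Connection, tenant_id: str) -> list:
    """Stand-in for the policy: every read is filtered by the authenticated tenant.

    In PostgreSQL the equivalent would be enforced by the database, e.g.
        ALTER TABLE events ENABLE ROW LEVEL SECURITY;
        CREATE POLICY tenant_isolation ON events
            USING (tenant_id = current_setting('app.tenant_id'));
    """
    cur = conn.execute(
        "SELECT tenant_id, metric, value FROM events WHERE tenant_id = ?",
        (tenant_id,),  # bound parameter, never string concatenation
    )
    return cur.fetchall()


def assert_no_cross_tenant_rows(conn: sqlite3.Connection, tenant_id: str) -> int:
    """Negative test: fail loudly if any returned row belongs to another tenant."""
    rows = fetch_events(conn, tenant_id)
    leaked = [row for row in rows if row[0] != tenant_id]
    assert not leaked, f"cross-tenant leak for {tenant_id}: {leaked}"
    return len(rows)
```

Running the same assertion for every tenant in a test fixture, and with hostile inputs such as `"acme' OR '1'='1"` as the tenant ID, covers the "valid credentials, wrong rows" case the paragraph above describes.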

**Physical isolation via separate schemas or databases**: Each customer's data lives in a separate schema or separate database. The BI tool or analytics layer connects to the appropriate schema for each customer. Physical isolation eliminates RLS logic bugs as a cross-tenant exposure vector.

The cost: schema proliferation creates operational overhead. With thousands of customers, each with a separate schema, schema management (onboarding, migrations, monitoring) becomes a significant engineering burden. Physical isolation is appropriate for high-security requirements (financial data, healthcare data) where the operational overhead is justified.
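With schema-per-customer isolation, the analytics layer has to map each authenticated customer to a schema name before it builds a query. A minimal sketch of that routing step is below; the `tenant_` prefix, the allowlist of tables, and the function names are assumptions for illustration only.

```python
import re

# Assumption: tenant IDs are short lowercase slugs. Anything else is rejected
# so that a tenant ID can never inject SQL into an identifier position.
_TENANT_ID = re.compile(r"[a-z0-9_]{1,48}")

# Assumption: the set of tables exposed to the analytics layer is fixed.
ALLOWED_TABLES = frozenset({"events", "daily_rollup"})


def schema_for_tenant(tenant_id: str) -> str:
    """Return the schema name for a tenant, or raise if the ID is not a valid slug."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def scoped_table(tenant_id: str, table: str) -> str:
    """Return a quoted schema-qualified table name for one tenant."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table not exposed to analytics: {table!r}")
    return f'"{schema_for_tenant(tenant_id)}"."{table}"'
```

Because identifiers cannot be passed as bound parameters, validating the tenant ID against a strict pattern is what keeps this routing step from becoming its own exposure vector.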

**Separate database instances per customer**: Each customer has a fully isolated database instance. Maximum isolation, maximum cost and operational overhead. Appropriate only for the largest enterprise customers or those with strict regulatory isolation requirements.

Most mid-market embedded analytics implementations use logical isolation with row-level security, with the RLS implementation at the data source level (not the BI tool level). The BI tool connects to a shared data source; the data source applies customer-specific filters before any data is returned to the BI layer.

Authentication and Session Management

Embedded analytics authentication must be seamless — users authenticated in the host application should not be prompted to authenticate again for the analytics component.

The standard pattern is JWT-based single sign-on:

1. The host application's backend authenticates the user and determines their customer ID and permissions

2. The backend signs a JWT containing the user's identity and the appropriate data access context

3. The JWT is passed to the embedded analytics component (as a query parameter, header, or through the SDK)

4. The analytics component validates the JWT and creates an analytics session with the appropriate data access context

Each major BI platform has its own version of this handshake: Tableau uses Connected Apps with JWTs, Power BI uses embed tokens generated by the backend, and Looker uses signed embed URLs that carry user attributes. For custom-built analytics, the analytics API validates the JWT directly.

Critical implementation requirement: the JWT must be signed by the backend with a secret that is never exposed to the client. A client-side signed JWT allows any user to claim any identity and access any customer's data. Backend-signed JWTs are the only secure implementation.

Performance at Scale

Embedded analytics serving hundreds or thousands of customers at once has different performance characteristics from internal BI usage.

**Query concurrency**: Each customer session generates queries. If 500 customers are actively viewing their analytics simultaneously, the analytics layer must serve 500+ concurrent query sessions. Design the warehouse and analytics layer for this concurrency from the start; retrofitting concurrent query capacity is more expensive than designing for it initially.

**Caching strategy**: Many dashboards have the same structure for every customer (a usage overview, for example) with different data behind it, so the query patterns are similar across customers while the results are not. Cache query results per customer, never globally: the cache key must include the customer ID so that Customer A's monthly summary is never served to Customer B. When Customer B requests their monthly summary, the same query shape runs with their filter and is cached under their own key. Repeat views by the same customer are then served from cache, which reduces warehouse load.
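A customer-scoped cache can be as small as a dictionary whose key always includes the customer ID. The sketch below is an in-process illustration with a fixed time-to-live; the class name, the key layout, and the `compute` callback are assumptions, and a shared cache such as Redis would follow the same keying rule.

```python
import time
from typing import Any, Callable


class TenantQueryCache:
    """In-memory result cache keyed by (tenant, query, params)."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[tuple, tuple[float, Any]] = {}

    def get_or_compute(self, tenant_id: str, query_name: str, params: tuple,
                       compute: Callable[[], Any]) -> Any:
        # The tenant ID is part of the key, so one tenant's cached result
        # can never be returned for another tenant's request.
        key = (tenant_id, query_name, params)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()  # runs the tenant-filtered query against the warehouse
        self._entries[key] = (now + self._ttl, value)
        return value
```

A call such as `cache.get_or_compute("acme", "monthly_summary", ("2027-06",), run_query)` hits the warehouse once per customer per TTL window, where `run_query` is whatever function executes the filtered query.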

**Pre-aggregation**: Dashboards that display aggregated metrics (total events last 30 days, revenue this month, DAU trend) can be served from pre-computed aggregations rather than recomputing from raw events on every view. A background job that computes the key metrics for each customer and writes them to an aggregation table reduces query-time compute significantly for high-traffic dashboards.
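The background job described here can be sketched as a rollup that rebuilds a per-customer daily table from raw events. SQLite is used only to keep the example self-contained; the `raw_events` and `daily_rollup` table names, their columns, and the full-rebuild strategy are assumptions, and a real job would more likely refresh incrementally.

```python
import sqlite3


def refresh_daily_rollup(conn: sqlite3.Connection) -> None:
    """Rebuild per-tenant daily event counts from the raw_events table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS daily_rollup (
               tenant_id   TEXT    NOT NULL,
               day         TEXT    NOT NULL,
               event_count INTEGER NOT NULL,
               PRIMARY KEY (tenant_id, day))"""
    )
    with conn:  # delete and insert commit together, or roll back together
        conn.execute("DELETE FROM daily_rollup")
        conn.execute(
            """INSERT INTO daily_rollup (tenant_id, day, event_count)
               SELECT tenant_id, date(occurred_at), COUNT(*)
               FROM raw_events
               GROUP BY tenant_id, date(occurred_at)"""
        )


def events_in_range(conn: sqlite3.Connection, tenant_id: str,
                    start_day: str, end_day: str) -> int:
    """Dashboard read path: sum pre-computed counts instead of scanning raw events."""
    row = conn.execute(
        """SELECT COALESCE(SUM(event_count), 0) FROM daily_rollup
           WHERE tenant_id = ? AND day BETWEEN ? AND ?""",
        (tenant_id, start_day, end_day),
    ).fetchone()
    return row[0]
```

The dashboard query now touches one row per customer per day, regardless of how many raw events sit behind it.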

**Connection pooling**: Analytics queries require database connections. With thousands of customers, connection pooling is essential — a pool of database connections shared across customer sessions rather than each session holding a dedicated connection. PgBouncer for PostgreSQL, Snowflake's own connection handling, or a middleware pooling layer provide this capability.
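The idea behind a pooling layer can be shown in a few lines: a fixed set of connections is created up front, and each request borrows one and returns it. This is a simplified sketch with hypothetical names, using SQLite in place of a real warehouse driver; it omits health checks and reconnection, which PgBouncer or a driver-level pool would handle.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Callable, Iterator


class ConnectionPool:
    """Fixed-size pool: sessions borrow a connection instead of owning one."""

    def __init__(self, connect: Callable[[], sqlite3.Connection], size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout: float = 5.0) -> Iterator[sqlite3.Connection]:
        # Blocks until a connection is free; raises queue.Empty on timeout,
        # which turns pool exhaustion into a visible error rather than a hang.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection to the pool


def make_pool(size: int) -> ConnectionPool:
    """Example factory; check_same_thread=False lets worker threads share connections."""
    return ConnectionPool(
        lambda: sqlite3.connect(":memory:", check_same_thread=False), size
    )
```

With a pool of this shape, thousands of customer sessions can share a few dozen database connections, and the pool size becomes the explicit cap on concurrent queries.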

Licensing Models for Embedded Analytics

Customer-facing embedded analytics is licensed differently from internal enterprise BI. BI vendors offer specific licensing for embedded/OEM use:

**Tableau**: Tableau's embedded analytics licence (separate from Enterprise licences) is designed for ISVs and SaaS products. Pricing is typically per customer account or per end-user count rather than per named licence. Requires a Salesforce sales conversation for pricing; not available through standard online purchasing.

**Power BI**: Power BI Embedded (A SKU) provides API-based embedding without requiring end users to have Power BI Pro licences. Pricing is based on capacity (A1-A8 SKUs) rather than per user. Appropriate for embedded customer analytics in SaaS products.

**Looker**: Looker's embedded analytics is typically priced per customer block or per active user, separate from the standard Looker Enterprise pricing. Looker's embed API is strong for programmatic, developer-driven embedding.

**Custom-built on Metabase, Superset, or similar open-source tools**: Some SaaS products build embedded analytics on open-source BI tools with custom multi-tenancy implementations. This avoids BI vendor licensing costs but requires significant engineering investment to build production-quality multi-tenancy, authentication, and performance.

Our Tableau consulting and BI strategy practice designs embedded analytics implementations, covering authentication, multi-tenancy, and performance. Contact us to discuss building embedded analytics into your product.
