How Power BI works under the hood — the dataset, report, and dashboard objects, the difference between Import and DirectQuery storage modes, how the Analysis Services engine powers Power BI datasets, Premium capacity vs Pro licensing, and the architecture decisions that determine whether your Power BI deployment scales.
Power BI is architecturally more complex than most users realise. Understanding how its components work — and how they interact — determines whether a Power BI deployment scales, performs well, and is manageable at enterprise level. This guide covers the architecture from dataset engine to delivery layer.
The Three Object Types
Power BI has three primary content objects:
**Dataset:** The data model. Contains the data (in Import mode) or the connection definition (in DirectQuery mode), the relationships between tables, measures written in DAX, calculated columns, and row-level security rules. The dataset is the analytical engine of Power BI. Performance is almost entirely determined by dataset design.
**Report:** The visualisation layer. A report connects to a dataset and arranges visuals — charts, tables, maps, cards — across pages. Reports do not contain data; they query the dataset. Multiple reports can connect to the same dataset (called a shared dataset or certified dataset in Power BI service).
**Dashboard:** A collection of tiles pinned from reports. Dashboards are for high-level monitoring; reports are for interactive analysis. Dashboards are simpler than reports — they cannot have slicers or complex interactivity.
The separation of dataset from report is architecturally significant. A well-designed shared dataset becomes the certified source of truth for an organisation's metrics. All reports referencing the shared dataset get consistent metric definitions automatically. This is analogous to a certified data source in Tableau or a LookML view in Looker.
Storage Modes
The storage mode of a dataset determines where data lives and how queries are served.
### Import Mode
Data is extracted from source systems and stored in Power BI's in-memory column store (VertiPaq). Queries run entirely against the in-memory engine — no source system involvement during query execution. This is the fastest query execution option.
**Advantages:** Fastest query performance; queries do not put load on source systems; works offline.
**Limitations:** Data is a snapshot as of the last refresh. Refresh must be scheduled or triggered manually. Dataset size is limited (1GB per dataset on Pro/Premium Per User; up to 25GB on Premium capacity with large dataset storage enabled). The refresh pipeline can be slow for large tables.
**When to use:** Most analytical use cases where sub-minute latency is acceptable. The majority of Power BI deployments should default to Import mode.
### DirectQuery Mode
Each visual query translates to a query sent directly to the source database. No data is stored in Power BI.
**Advantages:** Always current data; no size limits; no refresh needed.
**Limitations:** Query performance depends on source system performance and concurrency limits. DAX measure complexity is constrained (many DAX functions are not available in DirectQuery). Complex reports with many visuals generate many database queries per page load, potentially overloading the source. User experience is slower than Import mode.
**When to use:** When data must be real-time or near-real-time, when the dataset would be too large for Import mode, or when the source system has the capacity to handle the query load.
### Composite Mode
A dataset can mix Import and DirectQuery tables. Frequently-queried reference data (date tables, product dimensions) is imported for performance; the large fact table with real-time requirements uses DirectQuery. Power BI handles query routing automatically.
Composite mode is powerful but requires understanding of aggregation tables and query folding. For complex analytical models, composite mode is often the right architecture when Import mode alone cannot meet latency or size requirements.
### Dual Mode
Tables set to Dual mode act as Import or DirectQuery depending on how they are queried. When queried alongside an Import table, they behave as Import; when queried alongside a DirectQuery table, they behave as DirectQuery. Used for dimension tables in composite models to optimise query performance.
The Analysis Services Engine
Power BI datasets are powered by Analysis Services — Microsoft's mature OLAP engine that also underlies SQL Server Analysis Services (SSAS) and Azure Analysis Services. The Power BI VertiPaq engine is the in-memory column store from Analysis Services Tabular.
This matters for several reasons:
**DAX is the same language across all Analysis Services products.** Knowledge of DAX transfers between Power BI, SSAS, and Azure Analysis Services.
**XMLA endpoint:** Premium capacity datasets expose an XMLA endpoint, allowing external tools (SQL Server Management Studio, Tabular Editor, ALM Toolkit) to connect to the dataset directly for model management, query profiling, and deployment automation. The XMLA endpoint is essential for enterprise-scale Power BI development.
**Tabular Model Definition Language (TMDL):** The dataset model can be exported and version-controlled in TMDL format — a human-readable representation of the tabular model. Tabular Editor supports TMDL, enabling source-control-based development workflows for Power BI datasets.
Licensing Architecture: Pro vs Premium
Power BI has two licensing tiers that determine what infrastructure the datasets run on.
### Power BI Pro
The standard licence. Datasets run on shared Microsoft infrastructure. Limitations:
- 1GB per dataset maximum
- 8 refreshes per day
- No XMLA endpoint access
- No large dataset storage
- Concurrent user performance depends on shared infrastructure load
Pro is appropriate for teams under 100 users with datasets under 1GB and refresh requirements of no more than 8 times per day.
### Premium Capacity
An organisational capacity (P-SKU or EM-SKU) that provides dedicated infrastructure. Users viewing content on Premium capacity need only a free Power BI licence — they do not need Pro. Key capabilities:
- Large dataset storage (up to 400GB per dataset or 25GB standard)
- Unlimited refreshes per day (via enhanced refresh API)
- XMLA endpoint for external tool access
- AI features (dataflows with AI enrichment, AutoML)
- Deployment pipelines for dev/test/prod promotion
- Paginated reports for operational printing
Premium capacity is the enterprise option. The cost model changes: instead of per-user Pro licences, you pay for reserved compute capacity that all users share.
### Premium Per User (PPU)
A hybrid option: per-user Premium features without purchasing organisational capacity. Each user pays for PPU individually. Appropriate for smaller teams that need Premium features (XMLA, deployment pipelines, large datasets) but do not have enough users to justify an organisational P-SKU.
Dataflows
Dataflows are a Power BI-native ETL layer. They connect to sources, apply transformations using Power Query, and store the results as entities (tables) in Azure Data Lake Storage. Datasets can then connect to dataflows rather than source systems directly.
Dataflows address a common Power BI governance problem: when every dataset has its own Power Query transformations connecting directly to source systems, the same transformations are duplicated across many datasets. A change to a business logic calculation requires updating every dataset individually.
With dataflows, shared transformation logic lives in the dataflow. Multiple datasets connect to the same dataflow entity and get consistent, maintained transformations. This is similar to the role dbt plays in the modern data stack — centralising transformation logic.
Dataflows are available on Pro; enhanced features (computed entities, linked entities, DirectQuery connections to dataflows) require Premium.
The Gateway
On-premise data sources require the Power BI Gateway — software installed on an on-premise server that proxies requests between Power BI service and on-premise systems. The gateway handles authentication, query execution against local systems, and returning results to Power BI.
Gateway configuration matters for reliability. A single gateway is a single point of failure for all datasets using on-premise sources. Enterprise deployments use gateway clusters — multiple gateway installations that load-balance and fail over automatically.
For modern data architectures that have migrated data to cloud warehouses, gateway dependency can be eliminated — direct cloud-to-cloud connectivity between Power BI service and Snowflake/BigQuery/Redshift requires no gateway.
Deployment Architecture
Enterprise Power BI deployments should mirror software development best practices:
**Workspace tiers:** Separate Development, Test, and Production workspaces. Dataset changes move through the pipeline: developed in Dev, tested in Test (against production-representative data), promoted to Production. Premium's Deployment Pipelines feature automates this promotion.
**Service principal automation:** Dataset refreshes and deployment pipeline operations should be automated using service principals (Azure AD applications), not personal user accounts. Service principals do not expire or require password rotation and are not affected by user departures.
**Sensitivity labels:** Microsoft Information Protection sensitivity labels can be applied to Power BI datasets and reports, enforcing encryption and access restrictions for confidential content in downstream exports.
Our BI consulting practice designs enterprise Power BI architectures — contact us to discuss Power BI architecture for your organisation.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →