How the Power BI On-Premises Data Gateway connects Power BI Service to on-premises and private network data sources — deployment architecture, clustering for high availability, performance tuning, and the operational practices that prevent gateway from becoming a reliability bottleneck.
The Power BI On-Premises Data Gateway is the bridge between Power BI Service in the cloud and data sources that are not publicly accessible — on-premises databases, private network resources, or cloud services that require private connectivity. Without the gateway, Power BI Service cannot reach any data source that is not accessible from the public internet.
For enterprise Power BI deployments, the gateway is a critical piece of infrastructure that most organisations underinvest in until it becomes a reliability problem. This guide covers gateway architecture, clustering for high availability, performance configuration, and the operational practices that keep the gateway from becoming the weakest link in your analytics stack.
Gateway Architecture
The gateway is installed on a Windows machine inside your network (or on a VM in a private cloud network). It acts as an outbound proxy: Power BI Service sends query requests to the gateway, the gateway executes queries against the data source and returns results. All communication is outbound from the gateway — no inbound firewall rules are required.
Two gateway modes:
**Standard mode (On-Premises Data Gateway)**: used by Power BI Service, Azure Analysis Services, Power Apps, and Power Automate. Supports multiple users and data sources. This is the enterprise gateway mode.
**Personal mode (On-Premises Data Gateway — Personal mode)**: used by a single user for personal development. Not suitable for production deployments — does not support clustering, is not shared across users, and runs as a desktop application rather than a Windows service.
For any production Power BI environment where multiple users refresh datasets or publish reports, standard mode is the only appropriate choice.
Gateway Clusters for High Availability
A single gateway node is a single point of failure. Gateway failures cause all scheduled refreshes and live query connections to fail until the gateway is restored. For production environments, gateway clusters provide high availability.
**Cluster configuration.** A gateway cluster is created by installing two or more gateway nodes and registering additional nodes to an existing cluster. Power BI load-balances requests across available cluster members. If one node fails, requests are automatically routed to other nodes.
**Sizing cluster nodes.** Gateway nodes should be dedicated to the gateway function — not shared with other workloads. Each node's CPU, memory, and network bandwidth affect gateway throughput. For environments with high query volume or large dataset refreshes, adequately sized gateway nodes prevent the gateway from being a performance bottleneck.
As a starting point: 4–8 CPU cores, 16–32 GB RAM per gateway node for moderate workloads. High-volume environments with multiple simultaneous dataset refreshes of large datasets benefit from more CPU and memory. Monitor CPU and memory utilisation on gateway nodes and scale up if either is consistently above 70–80% during peak refresh windows.
**Node update strategy.** Gateway clusters must be kept up to date — Microsoft releases monthly gateway updates. With a cluster, nodes can be updated one at a time (the cluster continues serving requests from remaining nodes during the update), enabling zero-downtime updates. Without a cluster, any gateway update requires a maintenance window.
Data Sources and Credential Management
Each data source that the gateway connects to is configured in the Power BI Service gateway management UI (or via the Power BI REST API). Configuration includes:
- The connection string (server name, database name)
- Authentication credentials (username/password, Windows authentication, OAuth, service principal)
- Privacy level (private, organisational, public) — affects query folding behaviour in Power Query
**Credential security.** Credentials stored in the Power BI Service are encrypted. For environments with strict credential management requirements, configure service account credentials with minimal permissions — read-only access to specific schemas, not DBA-level credentials. Rotate credentials on a schedule and update the Power BI data source configuration when credentials change.
**Single Sign-On (SSO).** For some data sources (Azure SQL, SQL Server with Windows authentication, SAP HANA), the gateway supports SSO — passing the report viewer's identity to the data source rather than using a shared service account. SSO enables row-level security that is enforced at the database level based on the viewer's identity. SSO configuration requires Kerberos delegation setup on the gateway machine, which involves Active Directory configuration that the network/infrastructure team must support.
Scheduled Refresh Configuration
Most dataset refreshes in Power BI Service run through the gateway. Refresh failures often trace to gateway configuration issues:
**Refresh timeout.** The default timeout for a dataset refresh is 2 hours. For large datasets, this may be insufficient. Configure the dataset's refresh timeout in Power BI settings (Premium capacity) or request an extension from your Power BI administrator.
**Parallel refresh.** When multiple datasets are scheduled to refresh simultaneously and all connect through the same gateway, the gateway becomes a bottleneck. Stagger refresh schedules to avoid peak contention on the gateway. Use the gateway's performance monitoring (available in the gateway's Event Viewer logs) to identify periods of high concurrent load.
**Connection pooling.** The gateway maintains connection pools to data sources. For SQL-based sources, the pool size determines how many concurrent queries can run. If the pool size is too small, queries queue behind active connections. Configure connection pool sizes in the gateway configuration file (Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config) for high-volume data sources.
Performance Optimisation
Gateway performance problems manifest as slow dataset refreshes and slow live query response in DirectQuery reports.
**Gateway machine placement.** The gateway should be in the same network segment as the data sources it connects to. A gateway in the US East Azure region connecting to an on-premises SQL Server in a UK data centre has significant network latency on every query. Minimise hops and distance between gateway and data sources.
**Query folding.** DirectQuery reports and scheduled refreshes that fold queries correctly (push filter and aggregation operations to the data source) transfer less data through the gateway. Reports or transforms where query folding breaks execute large transfers through the gateway. Diagnose query folding using Power Query diagnostics and fix non-folding steps.
**Log monitoring.** The gateway writes detailed logs to the Program Data directory on the gateway machine. The logs record every query, its duration, whether it succeeded or failed, and any errors. Regularly review gateway logs for: consistently slow queries (indicating source-side performance problems), authentication errors (indicating credential configuration issues), and connection failures (indicating network problems).
**Update regularly.** Microsoft releases monthly gateway updates. Older gateway versions may lack performance improvements and bug fixes. Establish a monthly maintenance window for gateway updates as part of standard operations.
For Power BI architecture including gateway deployment, clustering, and enterprise refresh management, our BI strategy consulting covers Power BI at scale — contact us to discuss your Power BI infrastructure requirements.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →