SQL Query Optimisation for Cloud Data Warehouses

The query patterns that cause performance problems in cloud data warehouses — Snowflake, BigQuery, and Redshift — and the optimisation techniques that address them: predicate pushdown, partition pruning, clustering key alignment, CTE materialisation, window function efficiency, and avoiding common anti-patterns that scan more data than necessary.

Cloud data warehouses are remarkably capable, but they do not make all queries fast by default. The query patterns that cause performance problems in Snowflake, BigQuery, and Redshift follow predictable patterns — and the fixes are learnable. This guide covers the optimisation techniques that address the most common slow-query scenarios across the major cloud warehouses.

Understand Your Query's Cost Before Optimising

The first step is establishing what the query is actually doing. Warehouses provide explain plans that show the execution strategy before running, and query profiles that show what happened after.

**Snowflake:** EXPLAIN shows the logical plan; the Query Profile in the Snowflake UI shows the physical execution with per-node timing, rows processed, and spill-to-disk indicators. The most useful indicator in the Query Profile is which operator takes the longest and whether spill-to-disk is occurring (a sign that the query is using more memory than available).

**BigQuery:** EXPLAIN (via EXPLAIN ANALYZE in BigQuery Studio or via the Jobs API) shows estimated and actual bytes processed and slot milliseconds. The Query Execution Details in BigQuery console shows stages and their durations. High bytes processed with low slot time indicates full table scans; high slot time with low bytes indicates compute-intensive operations.

**Redshift:** EXPLAIN before running the query shows the physical execution plan with distribution and sort key alignment information. SVL_QUERY_SUMMARY and STL_QUERY_METRICS contain post-execution timing data. The key indicators are DS_DIST_INNER (redistribution during join), full table scans, and high CPU time.

Predicate Pushdown and Partition Pruning

The single most impactful optimisation for most slow queries: filter early, filter hard.

Cloud warehouses store data in micro-partitions (Snowflake), partitions (BigQuery), or sorted blocks (Redshift). When a query filters on a column used to define these partitions, the warehouse skips partitions that cannot contain matching rows — dramatically reducing the data scanned.

**The pattern:** Ensure WHERE clause filters are on the partition column, and that those filters appear as early in the query as possible.

In Snowflake, clustering keys define micro-partition boundaries. A query filtering on a well-clustered column (e.g., date for a time-series fact table) scans only the relevant micro-partitions. A query filtering only on a non-clustered column must scan all micro-partitions. Snowflake's Automatic Clustering keeps the table organised on the clustering key; queries that filter on other columns will not benefit from partition pruning.

In BigQuery, partitioned tables (on a DATE, TIMESTAMP, or integer range column) enable partition pruning. A query that does not filter on the partition column scans the entire table regardless of other predicates. BigQuery charges by bytes scanned; full table scans on large tables are both slow and expensive. Always filter on the partition column for partitioned tables, and ensure queries use a literal or parameter value (not a subquery result) for the partition filter — some forms of subquery partition filters are not recognised as partition pruning predicates.

In Redshift, sort keys enable zone map filtering. Tables sorted on date_column allow queries filtering on date_column to skip blocks containing no data in the date range. Unsorted rows (inserted after the last VACUUM) do not benefit from zone map filtering.

Join Order and Cardinality

The query planner chooses join order based on table statistics. In most cases, you want the query to join smaller tables to larger ones and filter aggressively before joining.

**Broadcast joins for small tables:** When joining a large fact table to a small dimension table, the warehouse can broadcast a full copy of the small table to every node/worker, allowing the join to happen locally without redistributing the large table. Snowflake and BigQuery do this automatically when the small table is small enough. In Redshift, ALL distribution on small dimension tables achieves the same effect.

**Avoid joining on expressions:** A join condition like ON LOWER(a.email) = LOWER(b.email) prevents the warehouse from using indexes or partition pruning on the email column. Pre-compute the normalised value in a staging model and join on the normalised column.

**Filter before joining:** In subqueries and CTEs, apply WHERE filters before the JOIN rather than after. The warehouse's query planner should handle this in most cases, but explicit pre-filtering in CTEs makes the intent clear and occasionally helps with poorly estimated plans.

CTE Materialisation

CTEs (WITH clauses) are powerful for query readability, but their interaction with query execution depends on the warehouse.

**BigQuery:** CTEs are re-evaluated every time they are referenced in the query. If a CTE is referenced three times, it executes three times. For CTEs that are expensive to compute and referenced multiple times, materialise them as a temporary table or use CREATE TEMP TABLE.

**Snowflake:** CTEs are optimised inline — the query planner can push predicates into CTEs and avoid re-execution. However, for very complex CTEs that are referenced multiple times, explicit materialisation as a transient table sometimes produces better plans.

**Redshift:** CTEs are typically materialised as temporary tables automatically in the execution plan. Check the EXPLAIN output to confirm.

The practical rule: if a CTE contains a GROUP BY or aggregation and is referenced more than once in the outer query, materialise it explicitly.

Window Functions

Window functions (RANK, ROW_NUMBER, SUM OVER, LAG, LEAD) are powerful but can be expensive if not used carefully.

**Partition key alignment:** Window functions with PARTITION BY compute their result within each partition independently. If the PARTITION BY column is the same as the table's distribution key (Redshift) or clustering key (Snowflake), the partition computation can happen locally on each node without data movement. When the PARTITION BY column does not align with distribution, data must be redistributed before the window function can execute.

**Avoid applying window functions to the full table when filtering downstream:** A common pattern — compute a window function over the full table, then filter in the outer query — forces the window function to process all rows even if the outer filter reduces to a small subset. Restructure to filter before the window function where possible:

-- Less efficient: window function on all rows, filter after

select * from (

select order_id, customer_id,

row_number() over (partition by customer_id order by order_date) as order_rank

from fact_orders

) where order_rank = 1

-- More efficient: filter to relevant customers before the window function

select order_id, customer_id,

row_number() over (partition by customer_id order by order_date) as order_rank

from fact_orders

where customer_id in (select customer_id from target_customers)

**Multiple window functions:** When a query uses multiple window functions with the same PARTITION BY and ORDER BY, the warehouse can compute them in a single pass. Window functions with different PARTITION BY or ORDER BY require separate passes. Where possible, consolidate window functions that share partition/order specifications.

Aggregation and GROUP BY

**Avoid SELECT DISTINCT when GROUP BY is more specific:** SELECT DISTINCT scans all rows and removes duplicates across all columns. GROUP BY with explicit aggregation columns is more targeted and allows the warehouse to apply more optimisations.

**Approximate COUNT DISTINCT:** Exact COUNT DISTINCT requires tracking every distinct value seen, which is memory-intensive on high-cardinality columns. For analytics use cases where approximate precision is acceptable, use approximate count distinct functions:

- Snowflake: APPROX_COUNT_DISTINCT

- BigQuery: APPROX_COUNT_DISTINCT

- Redshift: APPROXIMATE COUNT(DISTINCT ...)

On cardinalities above 1M, approximate functions are typically within 1–2% of exact counts with a fraction of the compute cost.

**Pre-aggregate in dbt, not in BI:** Aggregations performed by a BI tool (Tableau, Looker) on every dashboard load consume warehouse compute on every query. For stable aggregations (daily revenue by region), pre-aggregate in dbt mart models and let the BI tool query the pre-aggregated result. The query cost moves from every dashboard load to the dbt pipeline run.

Anti-Patterns to Avoid

**SELECT star in production queries:** SELECT * reads all columns, including those not needed by the query. Columnar warehouses read only the columns referenced in the query — SELECT * forces reading all columns. Always specify the columns you need.

**Correlated subqueries:** A correlated subquery executes once for every row in the outer query. On a million-row outer table, a correlated subquery executes a million times. Replace correlated subqueries with window functions, lateral joins, or pre-aggregated CTEs.

**Implicit type casting in join conditions:** Joining on columns of different types (VARCHAR vs INT, DATE vs TIMESTAMP) forces implicit casting that prevents the warehouse from using sort/distribution alignment. Standardise column types in staging models.

**Non-sargable predicates:** Predicates that cannot be evaluated against partition metadata — WHERE YEAR(order_date) = 2025 instead of WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01' — prevent partition pruning. Use range predicates on date columns, not function calls.

**Ordering large result sets without LIMIT:** ORDER BY on a large result set requires sorting all rows before returning any. Unless you need the full sorted result, add LIMIT to return only the top N rows — the warehouse can optimise to return the top N without a full sort.

Our data architecture consulting practice designs analytics platforms and optimises query performance across Snowflake, BigQuery, and Redshift — contact us to discuss SQL performance for your warehouse environment.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →