SQL Query Optimisation: Writing Queries That Run Fast Against Large Data Sets

Most SQL performance problems are not mysterious — they are the predictable result of query patterns that the query planner cannot optimise. Understanding how the warehouse executes queries, what the execution plan reveals, and which SQL patterns are consistently expensive makes the difference between queries that run in seconds and queries that run for hours.

Most SQL performance problems are not mysterious — they are the predictable result of query patterns that the query planner cannot optimise effectively. Understanding how a column-oriented warehouse executes queries, reading execution plans, and recognising the query structures that are consistently expensive is the difference between transformations that run in seconds and those that time out in production.

How Columnar Warehouses Execute Queries

Cloud data warehouses (Snowflake, BigQuery, Redshift, Databricks) use columnar storage: data is stored by column rather than by row. A query that accesses 3 of 50 columns in a table only reads the storage for those 3 columns — the other 47 are not touched. This is the foundation of their performance advantage over row-oriented databases for analytical workloads.

**Predicate pushdown** moves filter conditions into the storage layer so that data is filtered before it reaches the compute layer. When a query filters on a column, the warehouse evaluates whether it can skip data segments that cannot contain matching values (using min/max statistics stored per micro-partition in Snowflake, or per file block in BigQuery) without reading them. Filters on the primary clustering or partitioning key are the most effective — they eliminate the maximum amount of irrelevant data at the storage layer.

**Columnar projection** means the warehouse only reads the columns referenced in the SELECT and WHERE clauses. SELECT * queries force the warehouse to read every column in the table. Writing SELECT with explicit column lists is not just a readability preference — it reduces the amount of data read.

**Micro-partition pruning** (Snowflake) and **partition elimination** (BigQuery, Redshift) skip irrelevant partitions entirely based on filter conditions. The effectiveness of partition pruning depends on whether the filter column is the partition key. A query that filters on a non-partition column cannot use partition elimination and must scan the entire table.

Reading Execution Plans

Execution plans show how the query planner intends to execute a query. Reading them reveals the actual operations and their costs — table scans, joins, aggregations — and identifies where the bottlenecks are.

In Snowflake, the query profile in the UI shows the execution plan graphically: each node represents an operation, and the width of the flow lines indicates relative data volume. Nodes where data volume narrows sharply represent effective filtering; nodes where it stays wide represent full scans or operations that do not filter effectively.

Key indicators of performance problems in execution plans:

**Full table scan with high bytes scanned** — the query is reading the entire table. Check whether a filter could have been applied earlier, whether the filter column should be the clustering key, or whether the query needs restructuring to filter before joining.

**Large intermediate result sizes** — a join that produces more rows than either input table indicates row multiplication, often from a many-to-many join or a missing join condition. Intermediate result sets that are larger than necessary consume memory and slow subsequent operations.

**Nested loop join instead of hash join** — for large tables, nested loop joins are extremely expensive. Modern warehouses should automatically choose hash joins for large table joins, but query structure can prevent this. Non-equi joins (joins on inequality conditions) often force nested loops.

**Expensive sort operations** — sorts on large datasets are expensive. A ORDER BY in a subquery that is then further filtered is sorting more data than necessary; restructuring to filter before sorting reduces the cost.

Query Patterns That Are Consistently Expensive

**SELECT DISTINCT on large tables** — DISTINCT forces a sort or hash deduplication across the full result set before returning rows. If you are using DISTINCT to handle duplicates in a join, the correct fix is usually to fix the join — not to apply DISTINCT as a patch. If DISTINCT is genuinely needed, ensure it operates on as few rows as possible by filtering first.

**CROSS JOINs** — cross joins multiply every row in the left table with every row in the right table. For large tables, this produces result sets that may be orders of magnitude larger than either input. Accidental cross joins (joins on conditions that produce many-to-many relationships) are a common performance issue in complex queries.

**Correlated subqueries** — subqueries that reference a column from the outer query execute once per row of the outer query. For large outer tables, this is equivalent to running the subquery millions of times. Rewriting correlated subqueries as joins or window functions eliminates the per-row execution.

**Row-by-row operations via UDFs** — user-defined functions that operate row-by-row are executed once per row, which eliminates the vectorised execution that makes columnar databases fast. Where possible, replace row-level UDF logic with set-based SQL operations.

**Functions on filter columns in WHERE clauses** — applying a function to a column in a WHERE clause (WHERE YEAR(order_date) = 2024) prevents partition pruning because the warehouse cannot determine which partitions contain matching values without evaluating the function. Rewrite as a range filter (WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31') to enable partition pruning.

Join Optimisation

Joins are the most common source of query performance problems in analytical SQL. The principles:

**Filter before joining** — reduce the size of tables before joining them. A subquery or CTE that filters to the required records before the join reduces the data the join must process.

**Join on the right keys** — join on indexed or clustered columns. Joining on non-clustered columns requires a full scan of both tables to find matching rows.

**Consider join order** — in most warehouses, the query planner determines join order automatically using statistics. When the planner makes a suboptimal choice (visible in the execution plan as unexpected data volumes), explicit join reordering or query hints can correct it.

**Avoid joining to aggregated subqueries when window functions can do the work** — a common pattern is joining a table to a subquery that computes an aggregation (e.g., maximum date per customer). Window functions compute the same result without a separate join.

Clustering and Partitioning

For large tables queried frequently with the same filter columns, clustering (Snowflake) or partitioning (BigQuery, Redshift) is the most impactful structural optimisation.

**Clustering** in Snowflake co-locates rows with the same clustering key values in adjacent micro-partitions. Queries that filter on the clustering key can skip entire micro-partitions without reading them. Clustering is most valuable for tables queried primarily by the same filter column (date, customer ID, region) with a high selectivity filter.

**Partitioning** in BigQuery divides the table into separate physical units by partition key value. Partition elimination is highly effective: a query filtering to a single partition reads only that partition's data, regardless of table size.

Clustering and partitioning impose a maintenance overhead (Snowflake automatically re-clusters tables over time; heavy DML activity in BigQuery requires partition management), but for tables queried frequently with the same filter patterns, the query performance benefit is substantial.

Our data architecture practice reviews and optimises analytics transformation layers for organisations experiencing query performance bottlenecks — contact us to discuss SQL performance for your data environment.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →