Snowflake Performance Tuning: Warehouse Sizing, Query Optimisation, and Cost Management

Snowflake performance problems and cost overruns have the same root causes: warehouses sized without a workload model, queries that scan more data than necessary, and usage patterns that accumulate credit spend without proportionate query performance. Systematic tuning addresses each layer.

Snowflake performance problems and cost overruns typically share the same root causes: warehouses sized without a workload model, queries that scan more data than necessary, and usage patterns that accumulate credit spend without proportionate query performance. Systematic tuning addresses each layer — virtual warehouse configuration, table clustering, query structure, and credit management — producing both faster queries and lower costs, since the two are directly related in Snowflake's architecture.

Understanding Snowflake's Credit Model

Snowflake charges on compute (virtual warehouse credits) and storage. Virtual warehouse credits are consumed per second of active warehouse time at a rate proportional to the warehouse size. An X-Small warehouse consumes 1 credit per hour; an X-Large consumes 16. Warehouses auto-suspend when idle and auto-resume on the next query — credits are charged only for the period between resume and suspend.

This model means that warehouse size is the primary lever for both performance and cost. A larger warehouse runs queries faster but costs more per query if it has spare capacity. A smaller warehouse costs less per credit but may run queries slower or queue them.

The credit consumption pattern that produces unexpected bills:

**Warehouses that do not auto-suspend** — if a warehouse is configured without auto-suspend, it runs continuously, consuming credits regardless of whether queries are running. Auto-suspend should be enabled with a short timeout (60–300 seconds for interactive warehouses) unless the use case requires persistent cache state (some BI tool connection patterns benefit from longer suspend times).

**Warehouse size without workload analysis** — sizing a warehouse based on preference rather than actual workload characteristics. A warehouse sized one level too large doubles the credit rate without improving performance for queries that are not compute-bound.

Warehouse Sizing Strategy

Virtual warehouse size affects performance through two mechanisms: parallelism (larger warehouses can process more data in parallel, reducing query execution time) and local cache size (larger warehouses have more local disk cache for recently accessed data, reducing repeated reads from remote storage).

The correct sizing process:

**Identify the workload** — what queries run on this warehouse? What are their characteristics: long-running transforms or short interactive queries? High concurrency or sequential execution? Data volume per query? Each workload type has different optimal sizing.

**Run the representative workload at multiple sizes** — Snowflake's query history allows comparing performance across warehouse sizes. For a long-running transform, test at the current size and one size larger; if the larger size reduces query time by more than the cost increase, it may be worth upsizing.

**Configure concurrency** — for interactive BI workloads with many concurrent users, concurrency scaling (auto-scaling across multiple clusters of the same size) provides better performance at peak load than a single larger warehouse. Multi-cluster warehouses scale out rather than up, handling concurrency without increasing single-query resource allocation.

**Use separate warehouses for separate workloads** — ETL/ELT transforms and interactive BI queries should use separate warehouses. Transforms often require sustained high throughput (benefiting from larger warehouses); interactive BI requires low latency for simple queries (benefiting from smaller warehouses with fast resume). Running both workloads on a single warehouse leads to resource contention and inappropriate sizing for both.

Table Clustering for Query Performance

Snowflake stores data in micro-partitions of approximately 50–500 MB of compressed data. Each micro-partition stores min/max statistics for each column. When a query includes a filter predicate, Snowflake uses these statistics to skip micro-partitions that cannot contain matching values — this is called micro-partition pruning.

By default, data is stored in the order it was inserted. If queries frequently filter on a column that is not correlated with insertion order (region, product category, status), the data for each filter value is scattered across many micro-partitions. The warehouse must read many micro-partitions to find the matching rows.

Clustering reorders data so that similar values of the clustering key are co-located in adjacent micro-partitions. After clustering, a query filtering on the clustering key skips the micro-partitions that contain non-matching values.

Clustering is beneficial when:

- The table is large (more than a few hundred GB)

- Queries frequently filter or group on the same column(s)

- The clustering column has reasonable cardinality (not a boolean, but not so high that no two rows share a value)

Common clustering keys: date (for time-series tables), customer ID (for tables queried by customer), region or country (for tables queried geographically).

Clustering has a cost: Snowflake automatically maintains clustering by running background cluster maintenance jobs that consume credits. The credit cost of cluster maintenance should be compared to the query credit savings from reduced scanning.

Query Optimisation for Snowflake

Several Snowflake-specific considerations beyond general SQL optimisation:

**The 'result_scan' function and caching** — Snowflake caches query results for 24 hours. Identical queries (same SQL text, same data, same user) return from cache without consuming warehouse credits. Parameterised queries that generate different SQL text on each run bypass the cache. For repeated reports with the same filter values, standardising query text to enable cache hits reduces credit consumption.

**Materialised views for expensive aggregations** — Snowflake materialised views are automatically maintained as the underlying data changes. For aggregations that are queried frequently but change slowly, a materialised view pre-computes the result and serves it without recomputation on each query. The maintenance cost (credits consumed to keep the materialised view current) must be less than the savings from serving queries from the pre-computed result.

**Lateral flatten for semi-structured data** — Snowflake's native support for VARIANT (JSON, Avro, Parquet) data includes the LATERAL FLATTEN function to expand arrays and objects. Querying deeply nested JSON through multiple levels of LATERAL FLATTEN produces queries that are expensive and hard to maintain. Flattening semi-structured data into typed columns at the staging layer (in the ELT pipeline) produces better query performance than querying VARIANT columns in the reporting layer.

**Avoid unnecessary DISTINCT and GROUP BY** — DISTINCT and GROUP BY operations require sorting or hashing all input rows before deduplication. For tables without duplicates, applying DISTINCT adds sort cost without benefit. Verify that upstream models produce non-duplicated output rather than applying DISTINCT defensively.

Cost Monitoring and Management

Snowflake provides warehouse credit consumption reporting in the SNOWFLAKE.ACCOUNT_USAGE schema. Key monitoring queries:

Credit consumption by warehouse per day — identifies which warehouses drive the most spend. Warehouses with high consumption and low query volume may be oversized or not suspending correctly.

Query history by credits consumed — identifies the most expensive individual queries. The top 20 queries by credit consumption are usually the highest-value targets for optimisation.

Spillage to remote storage — queries that exceed local warehouse memory spill intermediate data to remote storage, dramatically increasing query time. The QUERY_HISTORY view records 'bytes_spilled_to_remote_storage'; high spill indicates queries that would benefit from a larger warehouse size or query restructuring to reduce intermediate result size.

Resource monitors alert when credit consumption exceeds a defined threshold within a period, preventing unexpected bills from runaway queries or misconfigured warehouses.

Our cloud engineering and data architecture practice optimises Snowflake environments for both performance and cost — contact us to discuss Snowflake performance tuning for your organisation.

Get your data architecture audit in 30 minutes.

A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.

Book a Call →