
Snowflake Architecture: Understanding Virtual Warehouses, Storage, and the Service Layer

James Okafor
Data & Cloud Engineer
June 29, 2027 · 12 min read

Snowflake's architecture differs from traditional relational databases and from most other cloud data warehouses. Understanding how it works (the storage layer, the virtual warehouse compute layer, the cloud services layer, and how the three interact) is the foundation for making sound architecture, performance, and cost decisions.

Snowflake separates three layers that are tightly coupled in traditional databases: storage, compute, and cloud services. This separation drives most of Snowflake's distinctive behaviour: storage and compute scale independently, multiple compute clusters can run against the same data simultaneously, and compute is charged only while it is running.

The Storage Layer

Snowflake stores all table data in micro-partitions: immutable, compressed, columnar files held in cloud object storage (Amazon S3, Azure Blob Storage, or Google Cloud Storage, depending on the deployment). Snowflake manages this storage itself; it is not storage you provision, and you do not manage it directly. Snowflake handles file layout, compression, and clustering transparently.

Each micro-partition holds 50-500 MB of uncompressed data, stored compressed and organised by column. Columnar storage matches how analytical queries access data: a query that reads only 5 of 100 columns reads only those 5 columns, not the full row width. Snowflake also compresses each column automatically, choosing a compression scheme based on the data's characteristics, which further reduces storage volume. Compression ratios of roughly 3-5x are common for analytical data, although the actual ratio depends heavily on the data.
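The saving from reading only the columns a query needs can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model, not a description of Snowflake's internals: it assumes every column occupies the same amount of storage, and the function name and table size are hypothetical.

```python
def bytes_scanned(total_bytes: int, total_columns: int, columns_read: int) -> float:
    """Estimate bytes read from a columnar layout, assuming equal-width columns."""
    if total_columns <= 0:
        raise ValueError("total_columns must be positive")
    if not 0 <= columns_read <= total_columns:
        raise ValueError("columns_read must be between 0 and total_columns")
    return total_bytes * columns_read / total_columns


# A hypothetical 100-column table holding 200 GB of compressed data.
table_bytes = 200 * 10**9
scanned = bytes_scanned(table_bytes, total_columns=100, columns_read=5)
print(f"{scanned / 10**9:.0f} GB scanned of {table_bytes / 10**9:.0f} GB")
```

In practice columns differ widely in size, so the real fraction depends on which columns the query touches.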

**Micro-partition metadata**: Snowflake maintains metadata about each micro-partition, including the minimum and maximum value of each column within it. During query execution, Snowflake uses this metadata to skip micro-partitions that cannot contain rows matching the query's filter conditions (micro-partition pruning). A query filtering for customers in the UK can skip every micro-partition whose minimum-to-maximum range of country values does not include UK, without reading the data itself. Pruning is conservative: a micro-partition whose range includes UK is still read, even if it turns out to contain no UK rows.
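The min/max check is easy to sketch. The following is a minimal illustration of range-based pruning for an equality filter on one column. The `MicroPartition` class, its field names, and the sample ranges are hypothetical; Snowflake's real metadata and pruning logic are internal and more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroPartition:
    """Hypothetical stand-in for one micro-partition's metadata on one column."""
    name: str
    min_country: str
    max_country: str


def prune(partitions, value):
    """Keep only partitions whose [min, max] range could contain `value`."""
    return [p for p in partitions if p.min_country <= value <= p.max_country]


partitions = [
    MicroPartition("mp1", "AR", "DE"),
    MicroPartition("mp2", "FR", "JP"),
    MicroPartition("mp3", "KR", "US"),
    MicroPartition("mp4", "UK", "ZA"),
]

candidates = prune(partitions, "UK")
print([p.name for p in candidates])  # mp3 and mp4 remain; mp1 and mp2 are skipped
```

Note that mp3 survives pruning only because "UK" falls between "KR" and "US" alphabetically. The metadata cannot tell whether the partition actually holds any UK rows, so it still has to be read.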

**Automatic clustering**: By default, data in a Snowflake table is stored in the order it was inserted, so micro-partition pruning is effective only when that order matches your most common filter columns. For tables where queries consistently filter on a column that does not align with insert order (for example, filtering by country on a table loaded in date order), you can define a clustering key on that column. Automatic Clustering, a background service, then reorganises micro-partitions around the key as data changes. This improves pruning efficiency at the cost of credits consumed by the background reclustering.
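To see why storage order matters for pruning, consider a toy table of 1,000 rows split into fixed-size partitions. This is a simplified simulation under stated assumptions: the row counts, the five country codes, and the helper names are all made up, and real reclustering is incremental rather than a full sort.

```python
def partition_ranges(rows, key, rows_per_partition):
    """Split rows into fixed-size partitions in stored order.

    Returns the (min, max) of `key` for each partition.
    """
    ranges = []
    for start in range(0, len(rows), rows_per_partition):
        values = [row[key] for row in rows[start:start + rows_per_partition]]
        ranges.append((min(values), max(values)))
    return ranges


def partitions_scanned(ranges, value):
    """Count partitions whose range could contain `value`."""
    return sum(1 for low, high in ranges if low <= value <= high)


countries = ["DE", "FR", "JP", "UK", "US"]
# Rows arrive in date order; country is unrelated to the load order.
rows = [{"day": day, "country": countries[day % 5]} for day in range(1000)]

insert_order = partition_ranges(rows, "country", rows_per_partition=100)
clustered = partition_ranges(
    sorted(rows, key=lambda row: row["country"]), "country", rows_per_partition=100
)

print(partitions_scanned(insert_order, "UK"), "of", len(insert_order))
print(partitions_scanned(clustered, "UK"), "of", len(clustered))
```

In insert order every partition spans DE to US, so a filter on UK prunes nothing. Once the rows are ordered by country, only the two partitions that hold UK rows need to be read.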

**Time Travel**: Snowflake retains previous versions of micro-partitions for the period set by the DATA_RETENTION_TIME_IN_DAYS parameter. The default is 1 day; Enterprise Edition and higher allow up to 90 days for permanent tables, and a value of 0 disables Time Travel for the object. Within the retention window you can query a table as it was at an earlier point using the AT or BEFORE clause, each of which accepts a timestamp, a time offset, or a statement ID. The storage cost is proportional to the rate of data change and the length of the retention period.
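The idea of reading a table as of an earlier time can be modelled with a list of timestamped versions. This is a toy model only: it stores a full snapshot per write for simplicity, whereas Snowflake retains just the micro-partitions that changed. The class and method names are hypothetical.

```python
from bisect import bisect_right


class VersionedTable:
    """Toy model: each write stores a new immutable snapshot with a timestamp."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._timestamps = []
        self._snapshots = []

    def write(self, timestamp, rows):
        # Assumes writes arrive with increasing timestamps.
        self._timestamps.append(timestamp)
        self._snapshots.append(tuple(rows))

    def read_at(self, timestamp, now):
        """Return the table contents as of `timestamp`, if still retained."""
        if now - timestamp > self.retention_seconds:
            raise LookupError("requested time is outside the retention window")
        index = bisect_right(self._timestamps, timestamp) - 1
        if index < 0:
            raise LookupError("table did not exist at the requested time")
        return self._snapshots[index]


DAY = 86_400
table = VersionedTable(retention_seconds=1 * DAY)
table.write(0, ["a"])
table.write(3_600, ["a", "b"])
table.write(7_200, ["b"])

# State between the second and third writes.
print(table.read_at(5_000, now=10_000))
```

A longer retention window keeps more old versions reachable, which is why storage cost grows with both the retention setting and how often the data changes.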

**Zero-copy cloning**: Snowflake's CLONE syntax creates a new database object that references the same underlying micro-partitions as the source. No data is physically copied at clone time, so the clone is available almost immediately and appears to contain a full copy of the data. When either the original or the clone is later modified, new micro-partitions are written only for the changed data; everything else remains shared. This makes it practical to create dev and test environments on demand.
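Because micro-partitions are immutable, a clone only needs its own list of references to them. The sketch below illustrates that idea with a hypothetical `Table` class in which tuples stand in for micro-partitions; it is not how Snowflake implements cloning internally.

```python
class Table:
    """Toy table: a list of references to immutable partitions (tuples of rows)."""

    def __init__(self, partitions=None):
        self.partitions = list(partitions or [])

    def clone(self):
        # Copies only the list of references, never the partitions themselves.
        return Table(self.partitions)

    def append(self, rows):
        # New data always lands in a new partition.
        self.partitions.append(tuple(rows))


prod = Table([("r1", "r2"), ("r3", "r4")])
dev = prod.clone()
dev.append(["r5"])

shared = sum(1 for p in dev.partitions if any(p is q for q in prod.partitions))
print(len(prod.partitions), len(dev.partitions), shared)
```

After the write, the clone has three partitions and the original still has two. The two original partitions are the same objects in both tables, so only the new data takes extra space.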

The Virtual Warehouse Layer

Virtual warehouses are the compute layer. A virtual warehouse is a named cluster of compute resources (CPU, memory, local SSD cache) that executes SQL queries against data in the storage layer. Virtual warehouses have the following characteristics:

**Independent from storage**: Multiple virtual warehouses can read the same data simultaneously. A data engineering warehouse running heavy transformations does not compete with an analytics warehouse serving interactive dashboard queries, even when both are reading the same tables.

**Elastic**: Virtual warehouses can be scaled up (to a larger size, providing more CPU and memory for a single query) or scaled out (multi-cluster configuration, adding additional clusters for parallel query concurrency). Scale up to reduce query duration for complex queries; scale out to increase concurrency capacity.

**Billed by the second**: A running virtual warehouse is billed per second, at a rate proportional to its size, with a 60-second minimum each time it starts or resumes. A suspended warehouse consumes no compute credits. Auto-suspend pauses the warehouse after a configurable idle period; auto-resume starts it automatically when a new query arrives.

**Warehouse sizes**: XS (1 server), S (2), M (4), L (8), XL (16), 2XL (32), 3XL (64), 4XL (128 servers). Each step up roughly doubles compute capacity and doubles the credit consumption rate, starting from 1 credit per hour for XS. Larger warehouses run complex queries faster; simple queries often run just as fast on a smaller warehouse, at lower cost.
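The size and billing rules combine into a simple cost estimate. The sketch below assumes the standard rate of 1 credit per hour for XS, doubling with each size, and the 60-second minimum that Snowflake applies each time a warehouse starts. The function name is hypothetical, and the figures ignore multi-cluster warehouses and any cloud services charges.

```python
CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8,
    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128,
}
MINIMUM_BILLED_SECONDS = 60  # applied each time the warehouse starts or resumes


def credits_used(size, running_seconds):
    """Credits for one start-to-suspend run of a single-cluster warehouse."""
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


print(credits_used("M", 900))   # 15 minutes on Medium
print(credits_used("XL", 225))  # the same work, if XL finishes it 4x faster
print(credits_used("XS", 10))   # a 10-second run is billed as 60 seconds
```

The first two calls cost the same number of credits. When a query speeds up in proportion to warehouse size, scaling up shortens the wait without raising the bill; when it does not, the larger warehouse simply costs more.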

**Local SSD cache**: Each virtual warehouse node has local SSD storage used as a data cache. When a query reads a micro-partition, the warehouse caches it locally. Subsequent queries that access the same micro-partition may read from cache rather than object storage, which improves performance significantly for repeated access patterns. The cache is specific to the warehouse: queries on different warehouses do not share it, and it is dropped when the warehouse is suspended. This is one reason to consolidate similar workloads on the same warehouse when possible.
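The effect of a per-warehouse cache can be shown with a small model. This is an illustrative sketch only: the `Warehouse` class is hypothetical, a dict stands in for object storage, and the least-recently-used eviction policy is an assumption rather than documented Snowflake behaviour.

```python
from collections import OrderedDict


class Warehouse:
    """Toy warehouse with its own local cache of micro-partitions."""

    def __init__(self, name, cache_capacity):
        self.name = name
        self.cache_capacity = cache_capacity
        self._cache = OrderedDict()
        self.remote_reads = 0
        self.cache_hits = 0

    def read(self, partition_id, object_storage):
        if partition_id in self._cache:
            self._cache.move_to_end(partition_id)  # mark as recently used
            self.cache_hits += 1
            return self._cache[partition_id]
        data = object_storage[partition_id]  # slower remote read
        self.remote_reads += 1
        self._cache[partition_id] = data
        if len(self._cache) > self.cache_capacity:
            self._cache.popitem(last=False)  # evict least recently used (assumed policy)
        return data


object_storage = {"mp1": b"partition-1", "mp2": b"partition-2"}
bi = Warehouse("bi_wh", cache_capacity=10)
etl = Warehouse("etl_wh", cache_capacity=10)

bi.read("mp1", object_storage)
bi.read("mp1", object_storage)   # served from bi_wh's cache
etl.read("mp1", object_storage)  # etl_wh has its own, still empty, cache

print(bi.remote_reads, bi.cache_hits, etl.remote_reads, etl.cache_hits)
```

The second warehouse pays for its own remote read even though the first has already cached the same partition, which is the trade-off behind consolidating similar workloads.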

The Cloud Services Layer

The cloud services layer is the "brain" of Snowflake: it handles query compilation, optimisation, metadata management, authentication, access control, and transaction management. Unlike the compute layer, it runs continuously as a shared Snowflake-managed service and is not directly visible to users.

**Query parsing and optimisation**: When you submit a SQL query, the cloud services layer parses it, validates syntax and access permissions, and generates an execution plan. The query optimiser determines which micro-partitions to read (via metadata pruning), the join order, and whether to use any materialized results.

**Query result cache**: The cloud services layer maintains a query result cache: if an identical query is submitted within 24 hours and the underlying data has not changed, Snowflake returns the cached result without executing the query against the virtual warehouse. This is a zero-compute operation — it does not consume warehouse credits. High-traffic BI workloads that run the same queries repeatedly benefit substantially from this cache.
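The reuse conditions described here (identical query, unchanged data, within 24 hours) can be sketched as a lookup with two validity checks. The `ResultCache` class and the `data_version` counter are hypothetical simplifications; Snowflake applies further conditions before reusing a result.

```python
class ResultCache:
    """Toy result cache keyed on exact query text."""

    TTL_SECONDS = 24 * 3600

    def __init__(self):
        self._entries = {}

    def put(self, query_text, data_version, now, result):
        self._entries[query_text] = (data_version, now, result)

    def get(self, query_text, data_version, now):
        """Return the cached result, or None if it cannot be reused."""
        entry = self._entries.get(query_text)
        if entry is None:
            return None
        cached_version, stored_at, result = entry
        if cached_version != data_version:
            return None  # underlying data changed since the result was stored
        if now - stored_at > self.TTL_SECONDS:
            return None  # result is older than 24 hours
        return result


cache = ResultCache()
query = "SELECT COUNT(*) FROM orders"
cache.put(query, data_version=1, now=0, result=42)

print(cache.get(query, data_version=1, now=3_600))   # reusable
print(cache.get(query, data_version=2, now=3_600))   # data changed
print(cache.get(query, data_version=1, now=90_000))  # older than 24 hours
```

Only the first lookup returns the stored result; the other two would fall through to a virtual warehouse.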

**Metadata management**: Table schemas, micro-partition metadata, Time Travel history, and transaction logs are all managed by the cloud services layer. Many DDL operations (for example CREATE TABLE, or an ALTER TABLE that renames a column) change only metadata, so they complete quickly regardless of table size.

**Transactions**: Snowflake supports full ACID transactions. The cloud services layer coordinates transaction management across concurrent reads and writes, ensuring consistent visibility and preventing conflicts. Transaction state is held in the cloud services layer rather than in the storage layer.

How the Layers Interact

When you run a query:

1. The cloud services layer receives the query, validates it, looks up metadata, and produces an execution plan

2. If the query result is in the query result cache and the data has not changed, the result is returned immediately — no virtual warehouse involved

3. Otherwise, the virtual warehouse executes the query: it reads the relevant micro-partitions from object storage (or from local SSD cache if cached), executes the query plan, and returns results to the cloud services layer

4. Results are returned to the client
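The four steps can be traced end to end in a few lines. This is a deliberately simplified sketch: plain dicts stand in for the result cache, the warehouse's local cache, and object storage; `plan` is a hypothetical function that returns the micro-partitions left after pruning; and invalidation of cached results when data changes is left out.

```python
def run_query(query, result_cache, warehouse_cache, object_storage, plan):
    """Return (rows, trace), where trace records which layer did what."""
    trace = ["cloud services: parse, validate, plan"]
    if query in result_cache:
        trace.append("cloud services: result cache hit, no warehouse used")
        return result_cache[query], trace

    rows = []
    for partition_id in plan(query):
        if partition_id in warehouse_cache:
            trace.append(f"warehouse: {partition_id} read from local SSD cache")
        else:
            warehouse_cache[partition_id] = object_storage[partition_id]
            trace.append(f"warehouse: {partition_id} read from object storage")
        rows.extend(warehouse_cache[partition_id])

    result_cache[query] = rows
    trace.append("cloud services: result returned to client")
    return rows, trace


def plan(query):
    # Pretend that pruning left two of the three partitions.
    return ["mp1", "mp2"]


object_storage = {"mp1": ["row1"], "mp2": ["row2"], "mp3": ["row3"]}
result_cache = {}
warehouse_cache = {"mp1": ["row1"]}

first_rows, first_trace = run_query(
    "SELECT * FROM t", result_cache, warehouse_cache, object_storage, plan
)
second_rows, second_trace = run_query(
    "SELECT * FROM t", result_cache, warehouse_cache, object_storage, plan
)

print("\n".join(first_trace))
print("\n".join(second_trace))
```

The first run touches all three layers; the repeat is answered by the cloud services layer alone.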

Because the layers are separated, an operation on one layer does not disturb the others:

- Scaling the virtual warehouse (resizing, suspending, resuming) does not affect the data or the service layer

- Multiple virtual warehouses reading the same data produce no contention — storage access is shared, not exclusive

- Metadata-only DDL operations complete quickly because no data movement is required

