A cloud data warehouse migration is the process of moving an on-premises or legacy warehouse environment to a modern cloud platform like Snowflake, BigQuery, or Redshift. This guide explains the phases of a migration, the key risks, and what distinguishes migrations that succeed from those that stall.
A cloud data warehouse migration is the project of moving an organization's analytical data environment from a legacy or on-premises system — Teradata, IBM Netezza, SQL Server Analysis Services, Oracle, or a poorly maintained cloud deployment — to a modern cloud data warehouse such as Snowflake, Google BigQuery, or Amazon Redshift. It is one of the highest-impact and highest-risk data infrastructure investments an organization can make.
The stakes are high because data warehouse migrations are long, complex, and organizationally disruptive when done poorly — and transformative when done well. Most organizations that struggle with analytics do so because their infrastructure is too slow, too expensive, or too brittle to support the analytical capability their business requires. A successful migration removes those constraints.
Why Organizations Migrate
**Legacy system cost.** On-premises warehouse systems — Teradata, Netezza, Oracle Exadata — carry high hardware, licensing, and maintenance costs that do not scale down when usage is low. Organizations typically spend millions annually on infrastructure that is used intensively for a few hours per day and sits idle for the rest.
**Performance limitations.** Legacy systems have fixed compute capacity. When demand exceeds the available nodes, queries queue and users wait. Scaling requires hardware procurement, installation, and weeks of lead time. Modern cloud warehouses scale on-demand, in minutes, with no hardware procurement.
**Operational overhead.** Managing on-premises warehouse infrastructure requires dedicated DBA and infrastructure teams. Patching, hardware failures, capacity planning, disaster recovery — all require internal engineering investment that cloud warehouses offload to the vendor.
**Feature gaps.** Modern cloud warehouses offer capabilities that legacy systems lack: semi-structured data support (JSON, Avro, Parquet), native machine learning integrations, serverless options, zero-copy data sharing, time travel, and ecosystem integrations with the modern data stack (dbt, Airflow, Fivetran, Looker).
**Tableau Server to Tableau Cloud migration** creates a parallel pattern: organizations migrating from on-premises Tableau Server to Tableau Cloud often find that a simultaneous data infrastructure modernization is necessary to meet the performance requirements of a cloud BI environment.
The Four Migration Phases
### Phase 1: Assessment and Planning
The assessment phase inventories what exists and produces a migration plan:
- **Inventory:** every table, every stored procedure, every scheduled job, every ETL pipeline, every BI connection that depends on the warehouse
- **Dependency mapping:** which tables and processes depend on which other tables and processes — the order of migration is determined by these dependencies
- **Business criticality classification:** which analytical capabilities are business-critical versus which are low-usage legacy artifacts that do not need to be migrated
- **Data volume and cardinality analysis:** how much data exists, how it is distributed, and what that implies for migration tool selection and timeline
- **Schema and SQL compatibility analysis:** how much of the existing SQL and schema design is directly compatible with the target platform versus requiring rewriting
Assessment typically takes two to four weeks for a medium-sized warehouse environment and produces a phased migration plan with effort estimates and risk flags.
### Phase 2: Infrastructure and Pipeline Setup
Before migrating data, the target infrastructure must be ready:
- Target warehouse provisioned and configured (network, security groups, IAM roles, VPC peering for private connectivity)
- Ingestion pipelines from source systems to the new warehouse established (this is an opportunity to modernize from legacy ETL to a modern EL tool like Fivetran or Airbyte)
- dbt project structure established in the target warehouse
- CI/CD and orchestration tooling (Airflow, Prefect) connected to the new environment
- Development, staging, and production environments separated with appropriate access controls
### Phase 3: Data Migration and Transformation Rewrite
The migration phase moves historical data and rewrites transformation logic:
- **Historical data migration:** bulk loading historical data from the legacy warehouse to the target, typically using cloud object storage (S3, GCS) as an intermediary
- **Incremental replication:** establishing live replication (CDC or batch extraction) from source systems to both the legacy warehouse (keeping it current during parallel running) and the new warehouse
- **SQL rewrite:** converting stored procedures, ETL scripts, and data models from legacy SQL dialects (Teradata SQL, Netezza SQL, T-SQL) to the target platform's dialect. This is often the largest single effort item in a migration.
- **BI connection migration:** reconnecting Tableau workbooks, Power BI reports, and other BI tools from the legacy warehouse to the new warehouse
### Phase 4: Validation and Cutover
Validation confirms that the new environment produces the same results as the legacy environment before cutover:
- **Row count reconciliation:** every table in the new warehouse has the same row count as the corresponding table in the legacy warehouse
- **Metric validation:** the 15-20 most critical business metrics produce identical results from new and legacy warehouse queries
- **Performance benchmarking:** key query patterns execute within acceptable latency on the new platform
- **User acceptance testing:** business stakeholders and key dashboard users validate that their analytical tools work correctly against the new warehouse
- **Cutover planning:** the specific sequence of steps to cut all live traffic from the legacy warehouse to the new warehouse, with a documented rollback plan if the cutover fails
What Makes Migrations Fail
**Underestimating SQL rewrite complexity.** The volume of stored procedures and ETL scripts in legacy warehouses is almost always larger than remembered. Organizations regularly discover tens of thousands of lines of SQL that need to be reviewed, converted, tested, and validated — often undocumented.
**Insufficient parallel running period.** Cutting over immediately after data migration, without a parallel running period where both warehouses produce the same outputs, means discovering migration errors in production. Parallel running — running both warehouses simultaneously and comparing outputs — should be a defined phase of the project.
**Migrating everything.** Some legacy warehouse artifacts have not been used in years. Migrating them adds cost and complexity without adding value. Assessment should identify what to migrate and what to retire.
**Treating migration as a technical project, not a change management project.** BI users whose dashboards change behavior, analysts whose SQL needs to be rewritten for a new dialect, and business stakeholders who lose access during cutover will all generate organizational friction. Stakeholder communication and training are project deliverables, not afterthoughts.
Our data architecture practice manages cloud data warehouse migrations — from assessment through validation and production cutover — for organizations modernizing their analytics infrastructure. Contact us to discuss your migration project.
A former Microsoft data architect audits your data foundation, identifies your top priorities, and sends you a written plan. Free. No pitch.
Book a Call →