A data architecture assessment diagnoses the current state of your data infrastructure — identifying the gaps, risks, and improvement opportunities before you invest in a build programme. Here is what a rigorous assessment covers and what outputs it should produce.
## The quick answer
A data architecture assessment is a structured evaluation of your current data infrastructure — what exists, how it is designed, what problems it has, and what it would cost to fix them. It is the starting point for any data platform investment: before you spend $200,000 building a modern data stack, you need to understand what you are building from, what problems you are solving, and what the right sequence of investment is.
A rigorous assessment typically takes 3–6 weeks and costs $15,000–$40,000 in professional services, depending on environment complexity; large or regulated environments can run to 10 weeks and $80,000 or more. The output is a current-state report, a gap analysis, a risk register, and a prioritised roadmap of improvements. Organisations that skip the assessment and go straight to building typically spend 2–3× more than they would have with a well-designed plan.
## What a data architecture assessment covers
### Data infrastructure inventory
The assessment catalogues all existing data infrastructure: data stores (warehouses, databases, data lakes, legacy systems), data movement (ETL pipelines, API integrations, file transfers, manual exports), transformation logic (stored procedures, ETL jobs, dbt models, BI tool calculated fields, spreadsheet calculations), and data serving (BI tools, APIs, application databases, reporting exports).
This inventory is often the first time a complete picture of the data environment exists in one place. Most organisations with more than 5 years of data infrastructure have components that are not well-documented — pipeline jobs running on a server that no one is actively maintaining, data sources that analysts know about but that are not in the official data catalogue, transformation logic that lives in an individual analyst's spreadsheet.
The inventory is not just documentation — it identifies risk. Undocumented pipelines, undocumented transformations, and sole-owner data assets are architecture risks that the assessment surfaces.
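As a rough illustration of how the inventory doubles as a risk scan, the sketch below models each asset as a record and flags the risk types named above. The `Asset` fields, the `risk_flags` helper, the sample asset names, and the extra "no owner" flag are all hypothetical choices made for this example, not part of any standard tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One row of a hypothetical inventory; field names are illustrative."""
    name: str
    kind: str                       # e.g. "pipeline", "transformation", "store"
    documented: bool
    owners: list[str] = field(default_factory=list)


def risk_flags(asset: Asset) -> list[str]:
    """Return the inventory-level risks that apply to a single asset."""
    flags = []
    if not asset.documented:
        flags.append(f"undocumented {asset.kind}")
    if len(asset.owners) == 0:
        flags.append("no owner")        # assumption: treated as its own risk
    elif len(asset.owners) == 1:
        flags.append("sole owner")
    return flags


# Toy inventory; a real one would be assembled during the assessment.
inventory = [
    Asset("nightly_orders_load", "pipeline", documented=False, owners=["alice"]),
    Asset("revenue_model", "transformation", documented=True, owners=["bob", "carol"]),
    Asset("legacy_ftp_export", "pipeline", documented=False, owners=[]),
]

# Keep only the assets that carry at least one risk flag.
risky = {a.name: risk_flags(a) for a in inventory if risk_flags(a)}
for name, flags in risky.items():
    print(f"{name}: {', '.join(flags)}")
```

Even a flat structure like this makes the risk conversation concrete: every flagged asset becomes a candidate entry in the risk register.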
### Architecture design quality
Beyond cataloguing what exists, the assessment evaluates how it is designed. Specific areas:
**Schema design**: are warehouse tables designed for analytical queries (dimensional models, star schemas) or for operational use (OLTP-style normalised tables that perform poorly for analytics)? Is the schema documented? Are naming conventions consistent?
**Pipeline design**: are pipelines idempotent (can they be re-run without producing duplicates)? Are they monitored? Are errors handled gracefully or do they fail silently? Is the same transformation logic duplicated across multiple pipelines?
**Governance quality**: is there a data catalogue? Are data assets owned? Are quality standards defined and enforced? Is there a semantic layer, or does each BI tool define metrics independently?
**Scalability**: will the current architecture handle 5× the current data volume? 10× the current query concurrency? Where are the bottlenecks?
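The idempotency question under pipeline design can be made concrete with a small sketch. The example below uses Python's built-in `sqlite3` module and a hypothetical `orders` table; the table, the `load_batch` function, and the sample rows are assumptions for illustration only. Keying the load on a primary key and replacing on conflict is one common way to make a re-run safe, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, status TEXT)"
)


def load_batch(conn, rows):
    """Idempotent load: a row with an existing order_id is replaced, not duplicated."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO orders (order_id, amount, status) VALUES (?, ?, ?)",
            rows,
        )


batch = [(1, 120.0, "paid"), (2, 75.5, "pending")]
load_batch(conn, batch)
load_batch(conn, batch)  # simulated re-run after a failure or retry

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)  # 2: the second run did not create duplicates
```

A pipeline that appends with a plain `INSERT` and no key would hold four rows after the same two runs, which is the kind of finding an assessment looks for.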
### Data quality diagnostic
The assessment includes a data quality evaluation: what is the current quality of data in key domains (customer, product, transaction, financial), what quality checks are in place, and where are the quality failures that are affecting business decisions.
Data quality findings are often the most actionable output of the assessment. This is not because quality problems are uniquely hard to fix, but because these failures directly cost the business money and erode the credibility of the analytics function.
Common findings: null rates higher than business rules should permit in key columns, duplicate records in master data (customer, product), referential integrity violations between related tables, inconsistent category values (same concept encoded differently in different systems), and stale data (tables that should be refreshed daily but are running days or weeks behind).
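Four of these five checks (null rates, duplicates, referential integrity, and staleness) are simple enough to sketch in plain Python. The tables, column names, and helper functions below are hypothetical and the data is made up; in practice the same checks would usually run as queries against the warehouse.

```python
from datetime import date

# Toy data; in a real assessment these rows would come from the warehouse.
customers = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 2, "email": "b@example.com"},   # duplicate key
    {"customer_id": 3, "email": "c@example.com"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "loaded_on": date(2024, 5, 1)},
    {"order_id": 11, "customer_id": 9, "loaded_on": date(2024, 5, 2)},  # orphan
]


def null_rate(rows, column):
    """Fraction of rows where the column is missing."""
    return sum(r[column] is None for r in rows) / len(rows)


def duplicate_keys(rows, key):
    """Key values that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        (dupes if r[key] in seen else seen).add(r[key])
    return dupes


def orphan_keys(child_rows, parent_rows, key):
    """Child key values with no matching parent row."""
    parent_keys = {r[key] for r in parent_rows}
    return {r[key] for r in child_rows} - parent_keys


def days_stale(rows, column, as_of):
    """Days between the most recent load date and the reference date."""
    return (as_of - max(r[column] for r in rows)).days


email_null_rate = null_rate(customers, "email")
dupes = duplicate_keys(customers, "customer_id")
orphans = orphan_keys(orders, customers, "customer_id")
staleness = days_stale(orders, "loaded_on", as_of=date(2024, 5, 6))

print(email_null_rate, dupes, orphans, staleness)
```

Each result only becomes a finding once it is compared against a business rule, for example a maximum permitted null rate or an expected daily refresh.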
### Team and process assessment
The assessment evaluates not just technology but the organisational context it operates in:
**Team structure and skills**: who owns the data infrastructure? What skills does the team have? What roles are missing? Is the team sized appropriately for the scope of the environment?
**Development practices**: is infrastructure-as-code used? Is there version control for transformation logic? Is there a deployment process? Is there monitoring and alerting?
**Governance processes**: are there data ownership assignments? Is there a data stewardship programme? Are quality issues tracked and resolved? Is there a data dictionary?
**Stakeholder satisfaction**: are data consumers (analysts, business stakeholders, product teams) satisfied with data reliability, freshness, and accessibility? What are the most common complaints?
### Regulatory and compliance review
For organisations in regulated industries (financial services, healthcare, insurance), the assessment includes a compliance review:
- What personal data is stored and how is it governed?
- Does the data lineage documentation meet regulatory requirements (BCBS 239, HIPAA, GDPR)?
- Are retention policies defined and enforced?
- Is access control sufficient for audit purposes?
- Are there data handling obligations from contracts or regulations that the current architecture may not satisfy?
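The retention question, for instance, splits into two separate findings: no policy exists, or a policy exists but is not enforced. A minimal sketch of that distinction follows, assuming a hypothetical policy expressed as a maximum record age in days per table; the table names, dates, and `retention_findings` helper are invented for illustration.

```python
from datetime import date

# Hypothetical retention policy: maximum record age in days per table.
RETENTION_DAYS = {"support_tickets": 365, "web_sessions": 90}

# Oldest record date observed in each table (toy values).
oldest_record = {
    "support_tickets": date(2023, 9, 1),
    "web_sessions": date(2023, 1, 15),
    "marketing_leads": date(2020, 3, 1),   # no policy defined for this table
}


def retention_findings(oldest, policy, as_of):
    """Classify each table that fails the retention check."""
    findings = {}
    for table, oldest_date in oldest.items():
        if table not in policy:
            findings[table] = "no retention policy defined"
        elif (as_of - oldest_date).days > policy[table]:
            findings[table] = "policy defined but not enforced"
    return findings


findings = retention_findings(oldest_record, RETENTION_DAYS, as_of=date(2024, 6, 1))
for table, finding in findings.items():
    print(f"{table}: {finding}")
```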
## What a good assessment output looks like
A rigorous assessment produces four outputs:
**Current state report**: a complete documentation of the existing data infrastructure, organised by domain (ingestion, storage, transformation, serving, governance). Includes architecture diagrams, inventory tables, and observations about design quality.
**Gap analysis**: for each area of the assessment, a structured comparison of current state against the standards that a modern, well-run data environment should meet. Specific, named gaps rather than generalised observations.
**Risk register**: the specific risks identified in the assessment — undocumented dependencies, sole-owner assets, compliance gaps, scalability limits — with an assessment of likelihood and impact.
**Prioritised improvement roadmap**: the recommended improvements, sequenced by business impact and technical dependency. Each improvement is estimated (effort and cost), assigned to a horizon (immediate/90-day/12-month/longer), and explained in terms of the business problem it addresses — not just the technical problem.
The roadmap should be actionable immediately: the organisation receiving the assessment should be able to hand it to an engineering team and start building. Recommendations that require another discovery phase before they can be acted on are a sign of an insufficient assessment.
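Sequencing by business impact and technical dependency can be sketched as a dependency-aware ordering: at each step, pick the highest-impact improvement whose prerequisites are already done. The example below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the improvement names, impact scores, and dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical improvements: impact score (higher = more business value)
# and the improvements each one depends on.
improvements = {
    "version_control_for_sql": {"impact": 2, "needs": []},
    "ci_deploy_pipeline": {"impact": 3, "needs": ["version_control_for_sql"]},
    "customer_dedup": {"impact": 5, "needs": []},
    "semantic_layer": {"impact": 4, "needs": ["customer_dedup", "ci_deploy_pipeline"]},
}


def sequence(items):
    """Order improvements by impact without violating any dependency."""
    sorter = TopologicalSorter({name: spec["needs"] for name, spec in items.items()})
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    ready, order = [], []
    while sorter.is_active():
        ready.extend(sorter.get_ready())            # newly unblocked items
        ready.sort(key=lambda n: items[n]["impact"])
        nxt = ready.pop()                           # highest impact among unblocked
        order.append(nxt)
        sorter.done(nxt)                            # may unblock dependants
    return order


roadmap = sequence(improvements)
print(roadmap)
```

In this toy case the semantic layer has high impact but lands last because both of its prerequisites must ship first, which is the trade-off the roadmap needs to make visible.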
## What a data architecture assessment costs
Assessment costs scale with environment complexity:
**Simple environments** (1–2 warehouses, 5–15 source systems, 50–200 dashboards, 1–5 data team members): $15,000–$25,000, 3–4 weeks.
**Mid-complexity environments** (2–4 data stores, 15–40 source systems, 200–500 dashboards, 5–15 data team members): $25,000–$40,000, 4–6 weeks.
**Large/complex environments** (multiple warehouses or lakehouses, 40+ source systems, 500+ dashboards, 15+ data team members, regulated industry): $40,000–$80,000+, 6–10 weeks.
These ranges reflect external consulting engagement costs. An internal team conducting a structured self-assessment can reduce costs, but it typically takes longer and lacks the objectivity that external assessors bring.
For the full pricing breakdown across assessment, build, migration, and retainer engagement types, see data architecture consulting cost.
## When to commission an assessment
**Before a major build investment**: if you are planning to spend $100,000+ on a data platform build, cloud migration, or BI platform replacement, a $20,000 assessment that clarifies scope and sequence prevents expensive wrong turns. The assessment pays for itself in avoided rework.
**When data quality issues are unresolved**: if the same data quality complaints keep recurring without resolution, the root cause is usually architectural rather than operational. An assessment identifies the architectural root causes that the team is papering over with workarounds.
**After significant organisational change**: a merger or acquisition, a major data team turnover, or a significant business model change often leaves the data infrastructure misaligned with current requirements. An assessment re-establishes baseline understanding and identifies the gaps.
**Before a new regulatory requirement**: GDPR, CCPA, BCBS 239, HIPAA — compliance programmes that touch data require understanding what data exists and how it flows before controls can be designed. An assessment provides the inventory and lineage documentation that compliance programmes need.
**When a new CDO or VP of Data joins**: the first 90 days of a new data leader's tenure is the natural time for an assessment. It provides an objective current-state baseline and a credible basis for prioritising the first year's investment.
## Red flags in assessment proposals
**Scope limited to technology**: an assessment that does not include team/process evaluation and business stakeholder interviews is only a technical audit. The root causes of poor data architecture are often organisational and process-based, not just technical.
**No current-state documentation**: an assessment that produces only recommendations without documenting current state is an opinion, not an assessment. You should receive a documented picture of what you have.
**Generic recommendations**: assessment outputs that could apply to any organisation without modification — "implement a data catalogue," "adopt CI/CD practices," "establish data governance" — without specifics about your environment are not worth the cost.
**No effort and cost estimates on recommendations**: a roadmap without effort estimates cannot be prioritised. Each recommendation should include an estimate of engineering effort and professional services cost.
Our data architecture consulting practice runs structured assessments as the standard starting point for platform investment. Assessments are scope-defined, time-boxed, and produce the four outputs described above. If you are evaluating whether a data architecture assessment is the right starting point for your programme, book a free 30-minute audit — a structured conversation about your current environment and what you are trying to achieve is often sufficient to clarify whether a full assessment is warranted.