
How to Conduct a Data Architecture Review: Findings, Recommendations, and Prioritisation

Austin Duncan
Managing Director & Principal Data Architect
June 29, 2027 · 13 min read

A data architecture review is a structured assessment of how well your current data infrastructure, pipelines, and analytical layer serve your organisation's analytical needs — and where the gaps are creating cost, risk, or missed capability. This guide covers what a rigorous review looks like and how to turn findings into an actionable roadmap.

A data architecture review is a structured assessment of the current state of an organisation's data infrastructure, transformation layer, analytical tooling, and governance — measured against the organisation's actual analytical requirements. The output is a clear picture of where the current architecture is working, where it is creating problems, and what changes will produce the most value.

Reviews are warranted when: the organisation is planning significant investment in data infrastructure and wants to understand the baseline before spending; recurring analytical quality issues suggest structural problems rather than isolated failures; scaling requirements are approaching the capacity of the current architecture; or a leadership change has created an appetite for an honest assessment of inherited infrastructure.

What a Review Covers

A complete data architecture review has five domains:

**Data ingestion and integration**: How does data get from source systems into the analytical environment? Are integrations current and maintained, or are there stale connectors producing outdated data? What is the latency profile — how old is the data typically when it reaches the warehouse? Are ingestion failures monitored and alerting appropriately? Are there gaps: source systems that produce data the organisation needs but that are not integrated?

**Storage and warehouse architecture**: What databases and warehouses are in use? Are they sized appropriately for current and projected workloads? Is there redundancy (the same data copied into several places, each treated as a source of truth), fragmentation (related data split across systems that are never brought together), or gaps (data that should be in the warehouse but is not)? What is the cost model, and is spend appropriate for the value delivered?

**Transformation and data modeling**: How is raw data transformed into analysis-ready form? Is transformation logic documented and tested? Is there a consistent modeling approach (dimensional modeling, dbt layering), or is each analyst producing their own version of the data? Where do data quality issues occur, and are they caught before they reach consumers?

**Analytical and BI layer**: What tools are used for reporting and analysis? Are they well-adopted or is there a mismatch between the tool investment and actual usage? Is content governed (certified data sources, consistent definitions) or fragmented (many versions of the same metric)? Are dashboards designed for the users who need them, or are they data-team-produced outputs that do not serve the analytical questions being asked?

**Governance and operations**: How is access managed? Are permissions appropriate and regularly reviewed? Is there audit logging for compliance? How are incidents (extract failures, data quality issues, performance degradation) detected and resolved? Is the analytical environment documented well enough for a new team member to become productive without relying on tribal knowledge?
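Some of these questions can be answered with numbers rather than opinions. As one illustration, the latency question under data ingestion could be profiled roughly as follows. This is a minimal sketch, not a prescribed method: the `records` list and the `latency_minutes` helper are hypothetical, and it assumes each row carries both a source-system timestamp and a warehouse load timestamp.

```python
from datetime import datetime
from statistics import median

def latency_minutes(records):
    """Minutes between source event time and warehouse load time, per record."""
    return [(loaded - created).total_seconds() / 60 for created, loaded in records]

# Hypothetical sample: (created_in_source, loaded_into_warehouse) pairs.
records = [
    (datetime(2027, 1, 1, 8, 0), datetime(2027, 1, 1, 8, 30)),
    (datetime(2027, 1, 1, 9, 0), datetime(2027, 1, 1, 10, 0)),
    (datetime(2027, 1, 1, 10, 0), datetime(2027, 1, 1, 13, 0)),
]

latencies = latency_minutes(records)
typical = median(latencies)  # how old the data typically is on arrival
worst = max(latencies)       # the tail that consumers notice

print(f"typical latency: {typical:.0f} min, worst: {worst:.0f} min")
```

Reporting both the typical and the worst case matters because a healthy median can hide a connector that occasionally delivers data hours late.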

How to Conduct the Review

**Start with the analytical questions, not the infrastructure**: Before looking at the architecture, understand what the organisation is trying to do with its data. Interview the primary data consumers — the heads of Finance, Marketing, Operations, Product — and ask: what analytical questions do you need to answer regularly? What questions are you unable to answer today? What do you trust, and what do you not trust?

These conversations establish the requirements the architecture is supposed to serve. Without requirements, there is no baseline for assessment — an architecture can only be evaluated against what it needs to do.

**Inventory the architecture systematically**: Build the inventory from each platform's API and metadata rather than from memory or documentation:

- Data warehouse: schema catalogue, table sizes, query history, user access patterns

- ETL/ELT: pipeline inventory, job failure rates, run duration trends

- BI tool: workbook inventory, data source catalogue, user activity data, subscription and alert configuration

- Orchestrator: job history, failure rates, schedule configuration

**Profile quality and reliability**: For each data pipeline, what is the failure rate? For each key metric, how is it defined and how many different definitions exist? For the most-used dashboards, what are the performance profiles — are they slow in ways that affect user experience?
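The first two questions reduce to simple counts once the run history and metric definitions have been exported. The sketch below assumes a hypothetical export format: `runs` as (pipeline, succeeded) pairs and `metric_definitions` as (metric, expression) pairs. Real orchestrators and BI tools expose this information in their own shapes, so the function names and data here are illustrative only.

```python
from collections import defaultdict

def failure_rates(run_history):
    """Return {pipeline: failed_runs / total_runs}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for pipeline, succeeded in run_history:
        totals[pipeline] += 1
        if not succeeded:
            failures[pipeline] += 1
    return {name: failures[name] / totals[name] for name in totals}

def definition_counts(definitions):
    """Return {metric: number of distinct definitions found}."""
    seen = defaultdict(set)
    for metric, expression in definitions:
        # Normalise whitespace and case so trivial differences are not counted.
        seen[metric].add(" ".join(expression.lower().split()))
    return {metric: len(expressions) for metric, expressions in seen.items()}

# Hypothetical exports from an orchestrator and a BI tool.
runs = [
    ("orders_ingest", True), ("orders_ingest", False),
    ("orders_ingest", True), ("orders_ingest", True),
    ("crm_sync", True), ("crm_sync", True),
]
metric_definitions = [
    ("revenue", "SUM(order_total)"),
    ("revenue", "SUM(order_total) - SUM(refunds)"),
    ("revenue", "sum(order_total)"),
    ("active_customers", "COUNT(DISTINCT customer_id)"),
]

rates = failure_rates(runs)
conflicts = definition_counts(metric_definitions)
```

Any metric with a count above one is a candidate finding: the organisation is reporting the same name with different logic.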

**Interview the data team**: The data team knows where the bodies are buried. Ask: what are the most frequent incidents? What parts of the architecture are fragile? What technical debt is accumulating? What do you wish had been built differently? What are you most worried about?

Prioritising Findings

Reviews typically produce more findings than can be addressed simultaneously. Prioritise using two dimensions:

**Impact**: How much is this finding costing the organisation, in money, time, risk, or missed capability? A finding that affects every analytical user daily has higher impact than one affecting a single analyst monthly.

**Effort**: How much work is required to address this finding? Some findings are quick wins (configuring an alert that is missing, fixing a broken extract schedule). Others are multi-month programmes (migrating a warehouse, rebuilding a transformation layer). Quick wins that have meaningful impact should be completed before beginning complex programmes.

A 2x2 priority matrix (high impact / low effort as immediate priorities; high impact / high effort as planned programmes; low impact / low effort as background work; low impact / high effort as deprioritised or declined) produces a rational sequencing.
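The matrix can be sketched in a few lines. The 1 to 5 scoring scale, the threshold of 3, and the example findings are assumptions made for illustration; a review team would agree its own scale and cut-off.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    title: str
    impact: int  # 1 (low) to 5 (high), scored by the reviewer
    effort: int  # 1 (low) to 5 (high), scored by the reviewer

# Quadrants in the order they should be sequenced.
ORDER = [
    "immediate priority",
    "planned programme",
    "background work",
    "deprioritised or declined",
]

def quadrant(finding, threshold=3):
    """Place a finding in one of the four quadrants of the matrix."""
    high_impact = finding.impact >= threshold
    high_effort = finding.effort >= threshold
    if high_impact and not high_effort:
        return "immediate priority"
    if high_impact and high_effort:
        return "planned programme"
    if not high_impact and not high_effort:
        return "background work"
    return "deprioritised or declined"

# Hypothetical findings for illustration.
findings = [
    Finding("Warehouse migration", impact=5, effort=5),
    Finding("Rebuild rarely used report suite", impact=2, effort=4),
    Finding("Missing ingestion failure alert", impact=4, effort=1),
    Finding("Rename legacy columns", impact=1, effort=2),
]

sequenced = sorted(findings, key=lambda f: ORDER.index(quadrant(f)))
for f in sequenced:
    print(f"{quadrant(f):<26} {f.title}")
```

The scores themselves remain a judgment call. The value of the matrix is that it forces every finding to be scored on the same two dimensions before sequencing is debated.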

The Review Deliverable

A useful review deliverable has three parts:

**Current state assessment**: A clear description of what exists, its quality, and its fitness for purpose. Written for a non-technical audience: executives who need to understand the problem before approving the solution.

**Gap analysis**: Where the current state falls short of requirements, with specific evidence. Vague observations ("the data quality is not good") are unhelpful; specific findings ("the customer dimension has a 12% duplication rate, producing inflated customer counts in all customer-level reports") are actionable.
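A specific figure like a duplication rate has to come from a stated calculation. One possible approach is sketched below, assuming duplicates are identified by a normalised email address. The matching rule and the sample rows are hypothetical, and the 12% figure above is the article's example, not an output of this code.

```python
def duplication_rate(rows, key):
    """Share of rows that repeat a key already held by another row."""
    keys = [key(row) for row in rows]
    if not keys:
        return 0.0
    return (len(keys) - len(set(keys))) / len(keys)

# Hypothetical customer dimension rows.
customers = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "Ana@Example.com "},  # same person, different casing
    {"customer_id": 3, "email": "ben@example.com"},
    {"customer_id": 4, "email": "cy@example.com"},
]

# Assumption: two rows are the same customer if their trimmed,
# lower-cased email addresses match.
rate = duplication_rate(customers, key=lambda r: r["email"].strip().lower())
print(f"duplication rate: {rate:.0%}")
```

Stating the matching rule alongside the number lets the reader of the gap analysis judge whether the finding is conservative or generous.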

**Prioritised roadmap**: Specific changes with estimated effort, projected impact, and recommended sequencing. The first 90 days should be quick wins with clear ownership. The 6-12 month horizon should be planned programmes with defined scope. Longer-horizon items should be flagged as requiring further scoping.

Our data architecture practice conducts reviews for mid-market enterprise environments — contact us to discuss a data architecture review for your organisation.
