Feature Store Architecture: Centralising ML Feature Engineering for Consistency and Reuse

A feature store is the infrastructure layer that centralises the computation, storage, and serving of features used in machine learning models. Without one, feature engineering is duplicated across teams, training-serving skew is endemic, and reusing features across models requires rediscovering and re-implementing logic that was already built somewhere else. For organisations with more than one or two ML models in production, the absence of a feature store is often one of the largest sources of ML infrastructure waste and model reliability problems.

What a Feature Store Solves

The core problems a feature store addresses:

**Training-serving skew** is the most serious. When features are computed independently for model training (typically by a data science team using warehouse SQL or Python) and model inference (typically by an engineering team serving predictions via API), subtle differences in computation accumulate into meaningful discrepancies: different null handling, different time zone conversions, different aggregation windows. The model receives different inputs at inference than it was trained on, and the effect is nearly invisible: it shows up as unexplained degradation in production metrics, not as obvious errors.

A feature store solves this by centralising feature computation: the same code produces the same feature values for both training and inference. Training data is assembled by reading historical feature values from the feature store's offline store; inference data is served from the feature store's online store. The computation logic is defined once.
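The idea of defining the computation once can be illustrated with a minimal sketch. The function names (`purchase_count_30d`, `build_training_row`, `build_serving_row`) and the 30-day window are hypothetical, chosen only for illustration; a real feature store would register the definition and run it against its own storage layers.

```python
from datetime import datetime, timedelta

def purchase_count_30d(purchase_times, as_of):
    """Count purchases in the 30 days up to and including `as_of`.

    This is the single definition of the feature (hypothetical example).
    """
    window_start = as_of - timedelta(days=30)
    return sum(1 for t in purchase_times if window_start < t <= as_of)

def build_training_row(purchase_times, label_time):
    # Training path: evaluate the feature as of the historical label time.
    return {"purchase_count_30d": purchase_count_30d(purchase_times, label_time)}

def build_serving_row(purchase_times, request_time):
    # Serving path: evaluate the same function as of the request time.
    return {"purchase_count_30d": purchase_count_30d(purchase_times, request_time)}

purchases = [datetime(2024, 1, 1), datetime(2024, 1, 20), datetime(2024, 2, 15)]
as_of = datetime(2024, 2, 1)
training_row = build_training_row(purchases, as_of)
serving_row = build_serving_row(purchases, as_of)
```

Because both paths call the same function, window boundaries and edge-case handling cannot drift apart between training and inference.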

**Feature duplication and inconsistency** occur when multiple data scientists build similar features independently. Team A computes a 30-day trailing purchase count for their recommendation model; Team B computes a monthly purchase frequency for their churn model. These are essentially the same feature with different names and potentially different edge case handling. When the underlying data changes, both implementations need to be updated, and one is likely to be missed.

A feature store makes features discoverable: teams can find existing features before building new ones. The feature catalogue includes descriptions, computation logic, and ownership, making it possible to determine whether an existing feature meets a new model's requirements before investing in reimplementation.

**Point-in-time correctness** for training data is difficult to achieve in ad-hoc analytics environments. Each training example needs features computed as of the prediction time, not as of today, and without leaking future information. This requires explicit temporal join logic that is easy to get wrong and expensive to debug when it produces data leakage.

Feature stores implement point-in-time correct retrievals as a primitive: given a list of entities with timestamps, retrieve the feature values that were current at each timestamp. This abstracts the temporal join logic into the infrastructure layer where it can be tested and validated once.
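A point-in-time retrieval can be sketched in plain Python as an "as of" lookup over a per-entity timeline. The helper names (`build_history`, `value_as_of`, `point_in_time_features`) and the sample data are assumptions for illustration; in pandas, `merge_asof` with a backward direction performs a comparable join.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime

def build_history(rows):
    """Group (entity_id, feature_ts, value) rows into sorted per-entity timelines."""
    history = defaultdict(list)
    for entity_id, ts, value in rows:
        history[entity_id].append((ts, value))
    for timeline in history.values():
        timeline.sort(key=lambda pair: pair[0])
    return history

def value_as_of(history, entity_id, as_of):
    """Return the latest value with feature_ts <= as_of, or None if none exists."""
    timeline = history.get(entity_id, [])
    timestamps = [ts for ts, _ in timeline]
    idx = bisect_right(timestamps, as_of)  # number of entries at or before as_of
    return timeline[idx - 1][1] if idx else None

def point_in_time_features(history, entity_rows):
    """For each (entity_id, event_ts), attach the value current at event_ts."""
    return [(e, ts, value_as_of(history, e, ts)) for e, ts in entity_rows]

# Hypothetical feature history: the value that became current at each timestamp.
history = build_history([
    ("c1", datetime(2024, 1, 1), 3),
    ("c1", datetime(2024, 2, 1), 5),
    ("c2", datetime(2024, 1, 15), 1),
])
training_rows = point_in_time_features(history, [
    ("c1", datetime(2024, 1, 10)),
    ("c1", datetime(2024, 2, 10)),
    ("c2", datetime(2024, 1, 10)),
])
```

The lookup never reads a value stamped after the event time, which is what prevents leakage. An entity with no history before its event time yields `None` rather than a later value.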

The Offline and Online Store Architecture

Feature stores have two storage layers:

**The offline store** is a data warehouse (Snowflake, BigQuery, Redshift) or data lake (S3, GCS) that stores historical feature values. It is used for training data retrieval: assembling the feature vectors for thousands or millions of historical observations for model training. Query latency is acceptable in the seconds-to-minutes range; throughput matters more than latency. The offline store is also used for batch inference when predictions can be computed in advance (daily churn scores, weekly segment assignments).

**The online store** is a low-latency key-value store (Redis, DynamoDB, Cassandra) that stores the most recent feature values for active entities. It is used for real-time inference: when a prediction request arrives, the serving layer retrieves the current feature values for the entity from the online store, assembles the feature vector, and scores it against the model. Latency requirements are typically under 20 milliseconds; throughput requirements are determined by the volume of inference requests.
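The serving-side lookup might look like the sketch below. A plain dictionary stands in for the key-value store, and the function name `assemble_feature_vector`, the feature names, and the default values are all hypothetical.

```python
def assemble_feature_vector(online_store, entity_id, feature_names, defaults):
    """Build a feature vector in a fixed order, falling back to defaults.

    `online_store` is a dict of entity_id -> {feature_name: value}, used here
    as an in-memory stand-in for Redis, DynamoDB, or similar.
    """
    stored = online_store.get(entity_id, {})
    return [stored.get(name, defaults[name]) for name in feature_names]

# The order must match the column order the model was trained with.
FEATURE_ORDER = ["purchase_count_30d", "account_age_days"]
DEFAULTS = {"purchase_count_30d": 0, "account_age_days": 0}

online_store = {
    "c1": {"purchase_count_30d": 4, "account_age_days": 380},
    "c2": {"account_age_days": 12},  # one feature not yet materialised
}

vector_c1 = assemble_feature_vector(online_store, "c1", FEATURE_ORDER, DEFAULTS)
vector_c2 = assemble_feature_vector(online_store, "c2", FEATURE_ORDER, DEFAULTS)
vector_new = assemble_feature_vector(online_store, "c9", FEATURE_ORDER, DEFAULTS)
```

Whatever default is used for a missing value should match how missing values were handled in training, otherwise the fallback itself becomes a source of skew.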

The synchronisation between offline and online stores is a critical infrastructure component. Feature values computed in the offline store need to be written to the online store on a schedule that matches the freshness requirement for inference. A churn score that needs to reflect usage from the last 7 days requires the online store to be updated at least daily with the trailing-7-day usage metric.
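A minimal sketch of that synchronisation step follows, assuming offline rows of the form `(entity_id, feature_name, timestamp, value)` and a dictionary standing in for the online store. The names `materialize_latest` and `is_fresh` are hypothetical.

```python
from datetime import datetime, timedelta

def materialize_latest(offline_rows, online_store):
    """Write the newest value per (entity, feature) into the online store."""
    for entity_id, feature_name, ts, value in offline_rows:
        key = (entity_id, feature_name)
        current = online_store.get(key)
        # Only overwrite when the incoming row is newer than what is stored.
        if current is None or ts > current["ts"]:
            online_store[key] = {"value": value, "ts": ts}
    return online_store

def is_fresh(online_store, key, now, max_age):
    """Check whether a stored value satisfies the freshness requirement."""
    record = online_store.get(key)
    return record is not None and now - record["ts"] <= max_age

offline_rows = [
    ("c1", "usage_7d", datetime(2024, 3, 1), 10),
    ("c1", "usage_7d", datetime(2024, 3, 2), 12),
    ("c2", "usage_7d", datetime(2024, 2, 20), 4),
]
store = materialize_latest(offline_rows, {})

now = datetime(2024, 3, 2, 12)
c1_fresh = is_fresh(store, ("c1", "usage_7d"), now, timedelta(days=1))
c2_fresh = is_fresh(store, ("c2", "usage_7d"), now, timedelta(days=1))
```

Storing the timestamp alongside the value lets the serving layer detect a stalled sync job instead of silently scoring on stale features.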

Feature Definition and Governance

A feature store is not just storage — it is a governed catalogue of defined features. Each feature has:

**A computation definition** — the transformation applied to source data to produce the feature value. For warehouse-based features, this is typically SQL; for streaming features, a streaming computation; for features requiring Python transformations, a UDF or transformation function.

**A data source reference** — the source table or event stream that the feature is computed from. This enables lineage tracking: for any feature, it is possible to trace back to the source data it was derived from.

**A refresh schedule** — how frequently the feature value is recomputed. Features that are slow-changing (account age, industry classification) may refresh weekly; features that capture recent behaviour (last 7-day activity count) need daily or near-real-time refresh.

**Ownership and documentation** — the team responsible for the feature, its definition in business terms, and any known limitations or edge cases. This metadata makes the feature catalogue useful for discovery.
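The four attributes above could be captured in a small registry like the following sketch. The class names, field names, and the example feature are assumptions for illustration, not the schema of any particular product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureDefinition:
    name: str
    computation: str        # e.g. the SQL that produces the value
    source: str             # source table or event stream, for lineage
    refresh: str            # e.g. "daily", "weekly"
    owner: str              # responsible team
    description: str        # definition in business terms
    known_limitations: tuple = ()

class FeatureRegistry:
    """In-memory catalogue; a real one might live in a warehouse table."""

    def __init__(self):
        self._features = {}

    def register(self, feature):
        # Refuse duplicates so two teams cannot silently redefine a name.
        if feature.name in self._features:
            raise ValueError(f"feature already registered: {feature.name}")
        self._features[feature.name] = feature

    def search(self, term):
        term = term.lower()
        return [f for f in self._features.values()
                if term in f.name.lower() or term in f.description.lower()]

registry = FeatureRegistry()
registry.register(FeatureDefinition(
    name="purchase_count_30d",
    computation="SELECT customer_id, COUNT(*) FROM purchases GROUP BY customer_id",
    source="warehouse.purchases",  # hypothetical table name
    refresh="daily",
    owner="growth-ml",
    description="Trailing 30-day purchase count per customer",
))
matches = registry.search("purchase")
```

A search over names and descriptions is the simplest form of discovery; it only helps if the descriptions are written in the business terms people actually search for.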

When to Build vs. Buy

The feature store landscape includes open-source frameworks (Feast), managed products (Tecton, AWS SageMaker Feature Store, Databricks Feature Store), and the option to build a minimal implementation using existing data infrastructure.

For most organisations, the build-minimal approach is most practical: use the existing data warehouse as the offline store (it already stores feature tables), a managed Redis or DynamoDB instance as the online store, and a lightweight feature registry (a metadata table in the warehouse or a simple YAML-based definition file) as the catalogue. This covers the core use cases (training data retrieval with point-in-time correctness, low-latency online serving, and feature discovery) without adopting a full managed platform.

The decision to adopt a managed feature store platform is warranted when: there are more than 10–15 active models in production; multiple teams are independently building and maintaining features; streaming features with sub-minute freshness requirements are needed; or the engineering cost of maintaining a bespoke solution exceeds the platform cost.

Our data architecture practice designs ML infrastructure including feature stores for organisations scaling beyond the first few production models — contact us to discuss your feature store architecture.
