Streaming Analytics Architecture: When Real-Time Data Processing Is Actually Necessary

Streaming analytics processes data as it is generated rather than accumulating it in batches for periodic processing. The architecture is significantly more complex and more expensive to build and operate than batch processing, so it is important to distinguish the use cases that genuinely require streaming from those that a well-designed batch pipeline serves adequately. Too often the streaming-versus-batch decision is driven by technical enthusiasm rather than by a business requirement.

When Streaming Is Actually Required

Streaming is required when the latency between event occurrence and actionable insight materially affects business outcomes. Three categories of use case:

**Time-sensitive interventions** — fraud detection, anomaly detection in IoT or manufacturing systems, and content recommendation in real-time sessions. A payment fraud detection system that processes transactions in minutes rather than seconds allows fraudulent transactions to complete. A recommendation engine that can only respond to user behaviour from prior sessions cannot personalise within-session recommendations. The window for intervention is seconds; batch processing cannot close that window.

**Operational monitoring and alerting** — production system health monitoring, SLA violation detection, and customer-facing error rate tracking. When a service is degrading, the business cost of discovering the problem hours later via a scheduled batch job is very different from discovering it 30 seconds after the degradation begins. For systems where downtime or degradation is costly, streaming monitoring is justified.

**Continuous aggregations with low latency tolerance** — some analytics use cases require aggregations that are always fresh: a live transaction counter for a trading platform, an active user count for a consumer application, real-time inventory levels for a fulfilment system. A batch pipeline can serve these only if the business accepts that the figures are stale between runs.

For every other analytics use case — understanding last week's performance, identifying trends, evaluating cohort behaviour — batch processing with 1–4 hour latency is sufficient and should be used.

The Streaming Architecture Stack

A streaming analytics architecture typically consists of:

**Event sources** — systems that produce events: application servers writing to message queues, IoT devices publishing to MQTT brokers, databases publishing change events via CDC. The event source layer is responsible for producing events reliably and at the required volume.

**Message broker** — the event transport layer that decouples producers from consumers and provides durable event storage, typically ordered within a partition, shard, or ordering key rather than across the whole stream. Apache Kafka is the dominant choice for high-throughput systems; Amazon Kinesis Data Streams and Google Cloud Pub/Sub are managed alternatives. The broker holds events for a configured retention period (typically 24 hours to 7 days), allowing consumers to process at their own rate and replay events if processing fails.

**Stream processing layer** — the compute that reads events from the broker, applies transformations, joins, aggregations, or filters, and produces outputs. Apache Flink is the dominant open-source stream processor; Kafka Streams is a lighter-weight option for simpler transformations within the Kafka ecosystem; Spark Structured Streaming is appropriate for organisations already invested in Spark. The stream processing layer requires careful state management — maintaining running aggregations across a continuous event stream requires state that must be persisted and recovered on failure.

**Output sinks** — where stream processing results are written: a low-latency database (Redis, Apache Druid, ClickHouse) for real-time dashboard queries; the data warehouse (via a streaming insert API or micro-batch load) for joining with historical data; or downstream systems (CRM, notification services, operational databases) for action delivery.
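
The decoupling, replay, and state-recovery behaviour described above can be illustrated with a toy, in-memory model. This is a minimal sketch, not how Kafka or Flink are implemented: `InMemoryLog` and `CountingConsumer` are hypothetical names invented for the example, and a real broker adds partitioning, retention, and durability.

```python
from collections import defaultdict


class InMemoryLog:
    """Toy stand-in for a broker topic: an append-only list of events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset):
        # Consumers pull from any retained offset, at their own pace.
        return list(enumerate(self._events))[offset:]


class CountingConsumer:
    """Keeps a running count per key plus the next offset to read."""

    def __init__(self, log):
        self.log = log
        self.next_offset = 0
        self.counts = defaultdict(int)

    def poll(self):
        for offset, event in self.log.read_from(self.next_offset):
            self.counts[event["key"]] += 1
            self.next_offset = offset + 1

    def checkpoint(self):
        # State and offset are saved together so they stay consistent.
        return {"next_offset": self.next_offset, "counts": dict(self.counts)}

    def restore(self, snapshot):
        self.next_offset = snapshot["next_offset"]
        self.counts = defaultdict(int, snapshot["counts"])


log = InMemoryLog()
for key in ["checkout", "login", "checkout"]:
    log.append({"key": key})

consumer = CountingConsumer(log)
consumer.poll()
snapshot = consumer.checkpoint()

log.append({"key": "login"})
consumer.poll()

# Simulate a crash: a fresh consumer restores the snapshot and replays
# only the events after the checkpointed offset.
recovered = CountingConsumer(log)
recovered.restore(snapshot)
recovered.poll()
```

Because the snapshot stores the running counts and the read offset together, the recovered consumer ends with the same totals as the one that never failed, without counting any event twice.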

Windowing and State Management

Stream processing operates on windows — time-bounded segments of the event stream. The window definition determines what events are grouped together for aggregation:

**Tumbling windows** are fixed, non-overlapping time buckets: aggregate all events in each 5-minute window independently. They are the simplest window type and appropriate for regular interval reporting (events per minute, errors per hour).

**Sliding windows** overlap: a 5-minute window that advances every 1 minute means each event is included in five successive windows. Sliding windows produce smoother trends but require more computation.

**Session windows** group events by user activity, with a gap threshold defining session boundaries. All events within 30 minutes of each other belong to the same session; a gap of more than 30 minutes starts a new session. Session windows are irregular in duration and require more complex state management.
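
The three window types differ mainly in how an event timestamp maps to windows. The sketch below shows that mapping in plain Python, assuming integer timestamps in seconds and windows aligned to the epoch; the function names are illustrative and do not correspond to any stream processor's API.

```python
def tumbling_window(ts, size):
    """Return the (start, end) of the single tumbling window containing ts."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every (start, end) sliding window that contains ts."""
    last_start = ts - (ts % slide)
    # Walk backwards one slide at a time while the window still covers ts.
    starts = range(last_start, ts - size, -slide)
    return sorted((s, s + size) for s in starts)


def session_windows(timestamps, gap):
    """Group timestamps into sessions, splitting where the gap exceeds `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


five_minutes = 300
one_minute = 60
thirty_minutes = 1800

tumbling = tumbling_window(430, five_minutes)
sliding = sliding_windows(430, five_minutes, one_minute)
sessions = session_windows([0, 600, 2000, 4000, 4100], thirty_minutes)
```

Here the event at second 430 falls into one tumbling window, (300, 600), but into five sliding windows, which is where the extra computation comes from. The session example splits into two sessions because the jump from 2000 to 4000 seconds exceeds the 30-minute gap.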

Late-arriving events, meaning events that arrive after the window they belong to has already been processed, are a fundamental challenge in streaming systems. Mainstream stream processors offer mechanisms for handling them (watermarks and allowed lateness in Flink, watermarks with a lateness threshold in Spark Structured Streaming), and the handling strategy requires explicit design decisions: allow a grace period for late events, update previously emitted window results when late events arrive, or discard late events and accept the small count discrepancy.
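
The three late-event strategies can be seen side by side in a small simulation. This is a simplified sketch loosely modelled on watermark semantics, not the Flink or Spark implementation: the `TumblingCounter` class, its parameters, and the rule that the watermark trails the highest event time seen by a fixed `max_out_of_order` are all assumptions made for illustration.

```python
from collections import defaultdict


class TumblingCounter:
    """Counts events per tumbling window, driven by a simple watermark."""

    def __init__(self, size, max_out_of_order, allowed_lateness):
        self.size = size
        self.max_out_of_order = max_out_of_order
        self.allowed_lateness = allowed_lateness
        self.watermark = float("-inf")
        self.counts = defaultdict(int)  # window start -> running count
        self.fired = set()              # windows that already emitted a result
        self.emitted = []               # (window_start, count, kind)
        self.dropped = 0

    def on_event(self, event_time):
        start = event_time - (event_time % self.size)
        end = start + self.size
        if end + self.allowed_lateness <= self.watermark:
            # Past the grace period: discard and accept the discrepancy.
            self.dropped += 1
        else:
            self.counts[start] += 1
            if start in self.fired:
                # Within the grace period: re-emit a corrected result.
                self.emitted.append((start, self.counts[start], "update"))
        self._advance(event_time - self.max_out_of_order)

    def _advance(self, candidate):
        # The watermark only moves forward.
        self.watermark = max(self.watermark, candidate)
        for start in sorted(self.counts):
            if start not in self.fired and start + self.size <= self.watermark:
                self.fired.add(start)
                self.emitted.append((start, self.counts[start], "first"))


counter = TumblingCounter(size=60, max_out_of_order=5, allowed_lateness=30)
for event_time in [10, 50, 70, 55, 130, 20]:
    counter.on_event(event_time)
```

In this run the event at second 55 arrives after its window has fired but inside the 30-second grace period, so the window's count is re-emitted as an update. The event at second 20 arrives after the grace period and is dropped.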

The Hybrid Architecture

Most production data architectures combine streaming and batch processing in a hybrid design. Streaming handles the latency-sensitive path; batch handles historical analysis and complex transformations that are not required in real time.

The Lambda architecture is the classical hybrid: the streaming layer (speed layer) produces approximate, low-latency results; the batch layer produces exact, high-latency results; a serving layer merges both. Lambda is operationally complex because the same business logic needs to be maintained in two separate codebases.
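
A serving layer in this design has to combine two views at query time. The following is a minimal sketch under the assumption that the batch view is exact up to a known cutoff timestamp and that the speed layer supplies the events after it; `merge_views` and the sample data are hypothetical.

```python
def merge_views(batch_view, batch_cutoff, speed_events):
    """Combine exact batch totals with counts from events after the cutoff."""
    merged = dict(batch_view)
    for event_time, key in speed_events:
        # The batch view already covers everything up to the cutoff,
        # so counting those events again would double-count them.
        if event_time > batch_cutoff:
            merged[key] = merged.get(key, 0) + 1
    return merged


batch_view = {"checkout": 100, "login": 250}
speed_events = [(990, "checkout"), (1005, "checkout"), (1010, "signup")]

served = merge_views(batch_view, batch_cutoff=1000, speed_events=speed_events)
```

Even in this toy form, the counting rule exists twice: once in whatever produced `batch_view` and once in the merge loop. Keeping those two definitions in agreement is the maintenance burden that makes Lambda costly to operate.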

The Kappa architecture simplifies by using streaming as the only processing path, with reprocessing of historical events from the event log when batch-style historical analysis is needed. This works well when the event broker retains events long enough to support reprocessing requirements. It reduces operational complexity at the cost of requiring the streaming infrastructure to handle all use cases, including those that do not benefit from streaming.
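
Reprocessing in this style amounts to running the same processing function over the retained log again, usually with revised logic. The sketch below assumes the full history is still retained and fits in a Python list; `process`, `retained_log`, and the `test` flag on events are illustrative names only.

```python
def process(events, include):
    """Single processing path: fold events into per-key counts."""
    counts = {}
    for event in events:
        if include(event):
            counts[event["key"]] = counts.get(event["key"], 0) + 1
    return counts


retained_log = [
    {"key": "checkout", "test": False},
    {"key": "checkout", "test": True},
    {"key": "login", "test": False},
]

# Original logic counted every event.
v1 = process(retained_log, include=lambda e: True)

# Revised logic excludes test traffic; rebuilding the result means
# replaying the same retained events through the same code path.
v2 = process(retained_log, include=lambda e: not e["test"])
```

There is only one codebase to maintain, but the approach depends entirely on the log still holding the events that need to be replayed.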

The practical recommendation for most organisations is to start with batch and add streaming selectively for the specific use cases that demonstrably require low latency. Streaming the wrong things adds complexity and cost without delivering business value; streaming the right things produces outcomes that genuinely could not be achieved with batch.

Our data architecture and cloud engineering practice designs streaming analytics infrastructure for organisations with genuine real-time requirements — contact us to discuss your streaming analytics architecture.